
Labels: background as 1, foreground as 0? #17

Open
yt2639 opened this issue Dec 19, 2021 · 1 comment

Comments


yt2639 commented Dec 19, 2021

Hi guys, thanks for your work and for sharing the code. I have a question about the labels fed into the loss. My understanding is: if we have a multi-class detection problem, say 5 categories, then the foreground labels are 0,1,2,3,4 and the background label is 5. Similarly, if we only have 1 class, the foreground label is 0 and the background label is 1.

I was just wondering whether this "fg-0 bg-1" convention gets flipped (to "fg-1 bg-0") when computing the loss, because in vadacore.ops.sigmoid_focal_loss, specifically in the sigmoid_focal_loss_cuda.cu file, I saw the following:

__global__ void SigmoidFocalLossForward(const int nthreads,
                                        const scalar_t *logits,
                                        const int64_t *targets,
                                        const int num_classes,
                                        const float gamma, const float alpha,
                                        const int num, scalar_t *losses) {
  CUDA_1D_KERNEL_LOOP(i, nthreads) {
    int n = i / num_classes;
    int d = i % num_classes;  // current class[0~79];
    int t = targets[n];       // target class [0~79];

    // Decide it is positive or negative case.
    scalar_t c1 = (t == d);
    scalar_t c2 = (t >= 0 & t != d);

And I guess this int d = i % num_classes; // current class[0~79] line is where the labels get flipped (so the effective per-class labels become bg-0 fg-1)?
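To illustrate my reading of the kernel, here is a small Python sketch of my own (not code from the repo) that just mirrors the two comparisons c1 and c2 quoted above, for every class column d of one anchor:

```python
# Sketch (my own, not repo code) of the positive/negative decision in the
# quoted SigmoidFocalLossForward kernel: for one anchor with integer label t,
# compute c1 and c2 for each class column d, exactly as the kernel does.

def per_class_targets(t, num_classes):
    """Return (c1, c2) per class column d for an anchor labeled t."""
    c1 = [int(t == d) for d in range(num_classes)]             # positive case
    c2 = [int(t >= 0 and t != d) for d in range(num_classes)]  # negative case
    return c1, c2

# Single-class setup from the question: fg label 0, bg label 1, num_classes 1.
print(per_class_targets(0, 1))  # ([1], [0]) -> fg is the positive case
print(per_class_targets(1, 1))  # ([0], [1]) -> bg is the negative case
```

So a foreground label t in [0, num_classes-1] produces c1 = 1 only at its own column, while a background label equal to num_classes never matches any column and is negative everywhere. In other words, the "flip" would happen implicitly through this per-column one-hot comparison rather than through an explicit label swap.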

The reason I have this question is that, when I look at the loss, it doesn't make sense unless the labels are flipped. Take the simplest case, binary cross-entropy loss; it should be

loss = - [y log(p) + (1-y) log(1-p)]

Minimizing the loss is equivalent to maximizing y log(p) + (1-y) log(1-p). So when y=1 we maximize p, and when y=0 we maximize 1-p, i.e. minimize p. Therefore, if the input labels follow the "bg-1 fg-0" convention, we need to flip them to "bg-0 fg-1". Is this correct?
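To make the argument concrete, here is a minimal numeric check (my own sketch, not from the thread) showing that with this BCE, y=1 rewards a large p and y=0 rewards a small p:

```python
import math

def bce(y, p):
    """Binary cross-entropy: -[y*log(p) + (1-y)*log(1-p)]."""
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

# y = 1: the loss shrinks as p grows toward 1.
assert bce(1, 0.9) < bce(1, 0.5)
# y = 0: the loss shrinks as p shrinks toward 0.
assert bce(0, 0.1) < bce(0, 0.5)
```

This is why the label must mean "1 = this class is present": if background anchors were fed in with y=1, minimizing the loss would push their predicted probability up instead of down.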

Thanks!


yt2639 commented Dec 20, 2021

@hxcai @mileistone Could you answer this question? Thanks!
