After filtering the CTC loss inputs by index, CTC reports insufficient GPU memory during the forward pass #14087

Open
zzk2021 opened this issue Oct 25, 2024 · 1 comment

Comments


zzk2021 commented Oct 25, 2024

🔎 Search before asking

  • I have searched the PaddleOCR Docs and found no similar bug report.
  • I have searched the PaddleOCR Issues and found no similar bug report.
  • I have searched the PaddleOCR Discussions and found no similar bug report.

🐛 Bug (Problem description)

Traceback (most recent call last):
File "/data/PaddleOCR/PaddleOCR-2.8.1/tools/train.py", line 292, in <module>
main(config, device, logger, vdl_writer, seed)
File "/data/PaddleOCR/PaddleOCR-2.8.1/tools/train.py", line 221, in main
lwf_program.train(
File "/data/PaddleOCR/PaddleOCR-2.8.1/tools/LwF_program.py", line 357, in train
loss = loss_class(preds, batch)
File "/home/server/anaconda3/envs/paddle/lib/python3.10/site-packages/paddle/nn/layer/layers.py", line 1429, in __call__
return self.forward(*inputs, **kwargs)
File "/data/PaddleOCR/PaddleOCR-2.8.1/ppocr/losses/rec_multi_loss.py", line 53, in forward
loss_func(predicts["ctc"], [batch[:2] + batch[3:], predicts["preds_distill"]["ctc"]])["loss"]
File "/home/server/anaconda3/envs/paddle/lib/python3.10/site-packages/paddle/nn/layer/layers.py", line 1429, in __call__
return self.forward(*inputs, **kwargs)
File "/data/PaddleOCR/PaddleOCR-2.8.1/ppocr/losses/rec_ctc_loss.py", line 63, in forward
ctc_loss = self.loss_func.forward(predicts, batch[0])
File "/data/PaddleOCR/PaddleOCR-2.8.1/ppocr/losses/rec_ctc_loss.py", line 92, in forward
loss = self.loss_func(predicts, labels, preds_lengths, label_lengths)
File "/home/server/anaconda3/envs/paddle/lib/python3.10/site-packages/paddle/nn/layer/layers.py", line 1429, in __call__
return self.forward(*inputs, **kwargs)
File "/home/server/anaconda3/envs/paddle/lib/python3.10/site-packages/paddle/nn/layer/loss.py", line 1261, in forward
return paddle.nn.functional.ctc_loss(
File "/home/server/anaconda3/envs/paddle/lib/python3.10/site-packages/paddle/nn/functional/loss.py", line 1940, in ctc_loss
loss_out = warpctc(
File "/home/server/anaconda3/envs/paddle/lib/python3.10/site-packages/paddle/nn/functional/loss.py", line 1901, in warpctc
loss_out = _C_ops.warpctc(
MemoryError: (ResourceExhausted) Fail to alloc memory of 559140344358520 size, error code is 12.
[Hint: Expected error == 0, but received error:12 != 0:0.] (at ../paddle/fluid/memory/allocation/cpu_allocator.cc:50)

🏃‍♂️ Environment

Linux (CentOS), 4x 3090 GPUs, PaddleOCR 2.8.1

🌰 Minimal Reproducible Example

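        # Indices of the samples to keep, converted to a tensor for indexing
        filtered_indices = paddle.to_tensor(filtered_indices)

        # Filter the labels and the label lengths with the same indices
        batch[0][1] = batch[0][1][filtered_indices]
        batch[0][2] = batch[0][2][filtered_indices]

        # Filter the predictions the same way, then compute the CTC loss
        ctc_loss = self.loss_func.forward(predicts[filtered_indices], batch[0])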
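
The traceback does not say why warpctc asked the CPU allocator for about 5.6e14 bytes. One possible explanation, which nothing in this report confirms, is that the filtered inputs are inconsistent with each other (for example an empty batch, or label lengths that no longer match the predictions). Below is a minimal sketch of sanity checks that could be run just before the loss call. It uses plain Python lists so it runs without Paddle; the function name `check_ctc_inputs` and its arguments are hypothetical and are not part of PaddleOCR or Paddle.

```python
def check_ctc_inputs(pred_shape, labels, label_lengths):
    """Return a list of human-readable problems; an empty list means none found.

    pred_shape    : (B, T, C) shape of the predictions (batch, time steps, classes)
    labels        : list of B padded label rows (lists of int class ids)
    label_lengths : list of B ints, the number of valid ids in each row
    """
    problems = []
    batch_size, time_steps, num_classes = pred_shape

    # An empty selection leaves nothing for the loss to work on.
    if batch_size == 0:
        problems.append("empty batch: no sample survived the filter")

    # Predictions, labels and label lengths must describe the same samples.
    if len(labels) != batch_size:
        problems.append(
            f"labels has {len(labels)} rows, predictions have {batch_size}")
    if len(label_lengths) != batch_size:
        problems.append(
            f"label_lengths has {len(label_lengths)} entries, "
            f"predictions have {batch_size}")

    for i, (row, length) in enumerate(zip(labels, label_lengths)):
        # A length outside the padded row would read garbage as labels.
        if length < 0 or length > len(row):
            problems.append(
                f"sample {i}: label length {length} outside [0, {len(row)}]")
            continue
        # CTC cannot align a label that is longer than the time axis.
        if length > time_steps:
            problems.append(
                f"sample {i}: label length {length} exceeds "
                f"{time_steps} time steps")
        # Every valid class id must index into the class axis.
        bad = [c for c in row[:length] if c < 0 or c >= num_classes]
        if bad:
            problems.append(
                f"sample {i}: class ids {bad} outside [0, {num_classes})")
    return problems


if __name__ == "__main__":
    # Two samples, 40 time steps, 100 classes; labels are zero-padded.
    print(check_ctc_inputs((2, 40, 100), [[5, 7, 0], [9, 0, 0]], [2, 1]))
```

With Paddle tensors, the same checks could be fed from `predicts[filtered_indices].shape`, `batch[0][1].numpy().tolist()` and `batch[0][2].numpy().tolist()`, assuming the predictions are laid out as (batch, time, classes) at that point. If the list comes back empty, the cause is more likely elsewhere, for example in dtypes or in how `filtered_indices` is built.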
@UserWangZz
Collaborator

Could you describe the problem you ran into in more detail?
