OSError: [Errno 22] Invalid argument #17

Open
@AnwarUllahKhan

Description

@hanzhanggit
@taoxugit
Please help me: what is the main cause of this error?

(base) H:\StackGAN\StackGAN-Pytorch-master\code>python main.py --cfg cfg/coco_eval.yml --gpu 0
Using config:
{'CONFIG_NAME': 'stageI',
'CUDA': True,
'DATASET_NAME': 'coco',
'DATA_DIR': '../data/coco',
'EMBEDDING_TYPE': 'cnn-rnn',
'GAN': {'CONDITION_DIM': 128, 'DF_DIM': 96, 'GF_DIM': 192, 'R_NUM': 4},
'GPU_ID': '0',
'IMSIZE': 64,
'NET_D': '',
'NET_G': '',
'STAGE': 1,
'STAGE1_G': '',
'TEXT': {'DIMENSION': 1024},
'TRAIN': {'BATCH_SIZE': 128,
'COEFF': {'KL': 2.0},
'DISCRIMINATOR_LR': 0.0002,
'FLAG': True,
'GENERATOR_LR': 0.0002,
'LR_DECAY_EPOCH': 20,
'MAX_EPOCH': 120,
'PRETRAINED_EPOCH': 600,
'PRETRAINED_MODEL': '',
'SNAPSHOT_INTERVAL': 10},
'VIS_COUNT': 64,
'WORKERS': 4,
'Z_DIM': 100}
Load filenames from: ../data/coco\train\filenames.pickle (82783)
embeddings: (82783, 5, 1024)
This section is run successfully...
STAGE1_G(
(ca_net): CA_NET(
(fc): Linear(in_features=1024, out_features=256, bias=True)
(relu): ReLU()
)
(fc): Sequential(
(0): Linear(in_features=228, out_features=24576, bias=False)
(1): BatchNorm1d(24576, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): ReLU(inplace)
)
(upsample1): Sequential(
(0): Upsample(scale_factor=2, mode=nearest)
(1): Conv2d(1536, 768, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(2): BatchNorm2d(768, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(3): ReLU(inplace)
)
(upsample2): Sequential(
(0): Upsample(scale_factor=2, mode=nearest)
(1): Conv2d(768, 384, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(2): BatchNorm2d(384, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(3): ReLU(inplace)
)
(upsample3): Sequential(
(0): Upsample(scale_factor=2, mode=nearest)
(1): Conv2d(384, 192, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(2): BatchNorm2d(192, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(3): ReLU(inplace)
)
(upsample4): Sequential(
(0): Upsample(scale_factor=2, mode=nearest)
(1): Conv2d(192, 96, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(2): BatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(3): ReLU(inplace)
)
(img): Sequential(
(0): Conv2d(96, 3, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): Tanh()
)
)
STAGE1_D(
(encode_img): Sequential(
(0): Conv2d(3, 96, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1), bias=False)
(1): LeakyReLU(negative_slope=0.2, inplace)
(2): Conv2d(96, 192, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1), bias=False)
(3): BatchNorm2d(192, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(4): LeakyReLU(negative_slope=0.2, inplace)
(5): Conv2d(192, 384, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1), bias=False)
(6): BatchNorm2d(384, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(7): LeakyReLU(negative_slope=0.2, inplace)
(8): Conv2d(384, 768, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1), bias=False)
(9): BatchNorm2d(768, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(10): LeakyReLU(negative_slope=0.2, inplace)
)
(get_cond_logits): D_GET_LOGITS(
(outlogits): Sequential(
(0): Conv2d(896, 768, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm2d(768, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): LeakyReLU(negative_slope=0.2, inplace)
(3): Conv2d(768, 1, kernel_size=(4, 4), stride=(4, 4))
(4): Sigmoid()
)
)
)
Preparing training data...
Traceback (most recent call last):
  File "main.py", line 77, in <module>
    algo.train(dataloader, cfg.STAGE)
  File "H:\StackGAN\StackGAN-Pytorch-master\code\trainer.py", line 158, in train
    for i, data in enumerate(data_loader, 0):
  File "C:\ProgramData\Anaconda3\lib\site-packages\torch\utils\data\dataloader.py", line 501, in __iter__
    return _DataLoaderIter(self)
  File "C:\ProgramData\Anaconda3\lib\site-packages\torch\utils\data\dataloader.py", line 289, in __init__
    w.start()
  File "C:\ProgramData\Anaconda3\lib\multiprocessing\process.py", line 105, in start
    self._popen = self._Popen(self)
  File "C:\ProgramData\Anaconda3\lib\multiprocessing\context.py", line 223, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "C:\ProgramData\Anaconda3\lib\multiprocessing\context.py", line 322, in _Popen
    return Popen(process_obj)
  File "C:\ProgramData\Anaconda3\lib\multiprocessing\popen_spawn_win32.py", line 65, in __init__
    reduction.dump(process_obj, to_child)
  File "C:\ProgramData\Anaconda3\lib\multiprocessing\reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
OSError: [Errno 22] Invalid argument
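
A possible reading of this traceback, offered as a guess rather than a diagnosis: on Windows, each DataLoader worker is started with the spawn method, so the process object (which references the dataset) is pickled and written to the child through a pipe, and that write is the call that fails here. The log prints embeddings: (82783, 5, 1024), so if the dataset keeps that array in memory, the payload is large. The sketch below only does the arithmetic. The 4-byte element size is an assumption (float32), since the log does not show the dtype.

```python
# Shape taken from the log line "embeddings: (82783, 5, 1024)".
num_images, captions_per_image, embedding_dim = 82783, 5, 1024

# ASSUMPTION: float32 elements (4 bytes each); use 8 for float64.
bytes_per_value = 4

total_bytes = num_images * captions_per_image * embedding_dim * bytes_per_value
total_gib = total_bytes / 2**30

print(f"{total_bytes} bytes = {total_gib:.2f} GiB per pickled copy")
```

If payload size is indeed the trigger, loading data in the main process avoids the pickling step entirely, which is what DataLoader does when num_workers is 0. In this repository that would presumably mean setting WORKERS to 0 in the config, assuming that value is what main.py passes as num_workers.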

Activity

matteobsu commented on Dec 4, 2018

Same issue. Did you solve it somehow?

AnwarUllahKhan (Author) commented on Dec 4, 2018

@matteobsu No, the error is still there. I think it is caused by insufficient memory. For now I am training the TensorFlow version of this instead.

matteobsu commented on Dec 5, 2018

I think I actually solved it. Open:

C:\ProgramData\Anaconda3\lib\multiprocessing\reduction.py

and, at line 60, replace

ForkingPickler(file, protocol).dump(obj)
with
ForkingPickler(file, protocol).dumps(obj)

ARMkernal commented on Oct 17, 2019

I changed it to dumps but I got this error:
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "D:\Anaconda\lib\multiprocessing\spawn.py", line 105, in spawn_main
    exitcode = _main(fd)
  File "D:\Anaconda\lib\multiprocessing\spawn.py", line 113, in _main
    preparation_data = reduction.pickle.load(from_parent)
EOFError: Ran out of input
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "D:\Anaconda\lib\multiprocessing\spawn.py", line 105, in spawn_main
    exitcode = _main(fd)
  File "D:\Anaconda\lib\multiprocessing\spawn.py", line 113, in _main
    preparation_data = reduction.pickle.load(from_parent)
EOFError: Ran out of input
Traceback (most recent call last):
  File "D:\Anaconda\lib\site-packages\torch\utils\data\dataloader.py", line 724, in _try_get_data
    data = self._data_queue.get(timeout=timeout)
  File "D:\Anaconda\lib\multiprocessing\queues.py", line 105, in get
    raise Empty
_queue.Empty

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "E:/Python文件/机器学习/lab06NN/MyNN.py", line 57, in <module>
    for i, data in enumerate(trainloader, 0):
  File "D:\Anaconda\lib\site-packages\torch\utils\data\dataloader.py", line 804, in __next__
    idx, data = self._get_data()
  File "D:\Anaconda\lib\site-packages\torch\utils\data\dataloader.py", line 771, in _get_data
    success, data = self._try_get_data()
  File "D:\Anaconda\lib\site-packages\torch\utils\data\dataloader.py", line 737, in _try_get_data
    raise RuntimeError('DataLoader worker (pid(s) {}) exited unexpectedly'.format(pids_str))
RuntimeError: DataLoader worker (pid(s) 7668, 8988) exited unexpectedly

How can I solve this RuntimeError?
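
A likely explanation for the EOFError, sketched with the standard library only: ForkingPickler.dumps is a classmethod that pickles into its own in-memory buffer and returns the bytes, so after the edit nothing is written to the file that feeds the child process. The original OSError disappears because the write no longer happens, and the child then finds an empty pipe. In the sketch, io.BytesIO objects stand in for the pipe and a small dict stands in for the process object; both are illustrative assumptions.

```python
import io
import pickle
from multiprocessing.reduction import ForkingPickler

# Hypothetical stand-in for the process object sent to the worker.
payload = {"batch_size": 128, "workers": 4}

# Original line: dump() serializes the object INTO the file given to the pickler.
pipe_with_dump = io.BytesIO()
ForkingPickler(pipe_with_dump, pickle.HIGHEST_PROTOCOL).dump(payload)

# Edited line: dumps() pickles into a private buffer and returns it;
# the file given to the pickler is never written to.
pipe_with_dumps = io.BytesIO()
returned = ForkingPickler(pipe_with_dumps, pickle.HIGHEST_PROTOCOL).dumps(payload)

bytes_sent_by_dump = len(pipe_with_dump.getvalue())
bytes_sent_by_dumps = len(pipe_with_dumps.getvalue())
print("bytes written by dump(): ", bytes_sent_by_dump)
print("bytes written by dumps():", bytes_sent_by_dumps)

# The child calls pickle.load() on its end of the pipe; simulate that
# with whatever dumps() actually wrote.
try:
    pickle.load(io.BytesIO(pipe_with_dumps.getvalue()))
    child_error = None
except EOFError as exc:
    child_error = str(exc)
print("child sees:", child_error)
```

If this is what happened, reverting reduction.py to the original dump(obj) line is the first step, since the edit hides the OSError rather than fixing it.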

zianzheng0806 commented on Aug 10, 2021

@ARMkernal Have you solved this new error? I ran into exactly the same problem.

matzl commented on Jan 25, 2024

I had a similar issue using multiprocessing. In my case, I saw RAM usage reach 100% in Task Manager just before the error appeared. I never had this issue on another computer with more RAM, so I guess the error is related to running out of RAM (or to a process that cannot wait long enough for data to come back from memory when RAM is full).
Workaround: decrease RAM usage, either in your Python script or by closing other programs.
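
One way to reduce what gets copied to each spawned process, sketched under the assumption that the data can be rebuilt or reloaded inside the child: keep only lightweight settings in the pickled state and load the heavy part on first access. EagerDataset and LazyDataset are hypothetical classes written for this illustration, and list(range(...)) stands in for reading a large file.

```python
import pickle


class EagerDataset:
    """Hypothetical dataset that holds all of its data in memory."""

    def __init__(self, num_items):
        self.items = list(range(num_items))

    def __getitem__(self, index):
        return self.items[index]


class LazyDataset:
    """Hypothetical dataset that stores only how to rebuild its data."""

    def __init__(self, num_items):
        self.num_items = num_items
        self._items = None  # filled on first access, separately in each process

    def __getitem__(self, index):
        if self._items is None:
            # Stand-in for reading a large file from disk.
            self._items = list(range(self.num_items))
        return self._items[index]

    def __getstate__(self):
        # Drop the loaded data from the pickled state so a child process
        # receives only the settings and reloads the data itself.
        state = self.__dict__.copy()
        state["_items"] = None
        return state


eager = EagerDataset(100_000)
lazy = LazyDataset(100_000)
first_item = lazy[0]  # forces the lazy dataset to load in this process

eager_size = len(pickle.dumps(eager))
lazy_size = len(pickle.dumps(lazy))
print(f"eager pickle: {eager_size} bytes, lazy pickle: {lazy_size} bytes")
```

This only shrinks what is sent to each worker at start-up. Each worker still needs memory for the data once it loads it, so lowering the number of workers may matter just as much.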

MeshkatShB commented on May 14, 2024

I got it working after a week of trial and error, searching online and testing every suggestion I found. I did two things. In my case, I was using LdaMulticore from gensim.models:
from gensim.models import LdaMulticore
My code was:
lda = LdaMulticore(input_text, num_topics=topic_num, id2word=dictionary, passes=1, workers=8)

First, I removed passes=1 and workers=8 entirely, so my code became:
lda = LdaMulticore(input_text, num_topics=topic_num, id2word=dictionary)
as suggested in: https://stackoverflow.com/questions/70218051/oserror-errno-22-invalid-argument-pickle-unpicklingerror-pickle-data-was.
At first, that did not work. Then I created a fresh virtual environment and installed all the libraries I use by hand (without requirements.txt, since I suspected a dependency problem). After that, it ran without errors.

In conclusion: first, remove the workers and passes arguments from your code. Second, create a fresh venv (virtual environment), install all your packages, and try again. I hope this works for everyone.
@matzl @zianzheng0806 @ARMkernal

          OSError: [Errno 22] Invalid argument · Issue #17 · hanzhanggit/StackGAN-Pytorch