OSError: [Errno 22] Invalid argument #17

Open
AnwarUllahKhan opened this issue Aug 27, 2018 · 7 comments

Comments

@AnwarUllahKhan

@hanzhanggit
@taoxugit
Please help: what is the underlying cause of this error?

(base) H:\StackGAN\StackGAN-Pytorch-master\code>python main.py --cfg cfg/coco_eval.yml --gpu 0
Using config:
{'CONFIG_NAME': 'stageI',
'CUDA': True,
'DATASET_NAME': 'coco',
'DATA_DIR': '../data/coco',
'EMBEDDING_TYPE': 'cnn-rnn',
'GAN': {'CONDITION_DIM': 128, 'DF_DIM': 96, 'GF_DIM': 192, 'R_NUM': 4},
'GPU_ID': '0',
'IMSIZE': 64,
'NET_D': '',
'NET_G': '',
'STAGE': 1,
'STAGE1_G': '',
'TEXT': {'DIMENSION': 1024},
'TRAIN': {'BATCH_SIZE': 128,
'COEFF': {'KL': 2.0},
'DISCRIMINATOR_LR': 0.0002,
'FLAG': True,
'GENERATOR_LR': 0.0002,
'LR_DECAY_EPOCH': 20,
'MAX_EPOCH': 120,
'PRETRAINED_EPOCH': 600,
'PRETRAINED_MODEL': '',
'SNAPSHOT_INTERVAL': 10},
'VIS_COUNT': 64,
'WORKERS': 4,
'Z_DIM': 100}
Load filenames from: ../data/coco\train\filenames.pickle (82783)
embeddings: (82783, 5, 1024)
This section is run successfully...
STAGE1_G(
(ca_net): CA_NET(
(fc): Linear(in_features=1024, out_features=256, bias=True)
(relu): ReLU()
)
(fc): Sequential(
(0): Linear(in_features=228, out_features=24576, bias=False)
(1): BatchNorm1d(24576, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): ReLU(inplace)
)
(upsample1): Sequential(
(0): Upsample(scale_factor=2, mode=nearest)
(1): Conv2d(1536, 768, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(2): BatchNorm2d(768, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(3): ReLU(inplace)
)
(upsample2): Sequential(
(0): Upsample(scale_factor=2, mode=nearest)
(1): Conv2d(768, 384, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(2): BatchNorm2d(384, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(3): ReLU(inplace)
)
(upsample3): Sequential(
(0): Upsample(scale_factor=2, mode=nearest)
(1): Conv2d(384, 192, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(2): BatchNorm2d(192, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(3): ReLU(inplace)
)
(upsample4): Sequential(
(0): Upsample(scale_factor=2, mode=nearest)
(1): Conv2d(192, 96, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(2): BatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(3): ReLU(inplace)
)
(img): Sequential(
(0): Conv2d(96, 3, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): Tanh()
)
)
STAGE1_D(
(encode_img): Sequential(
(0): Conv2d(3, 96, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1), bias=False)
(1): LeakyReLU(negative_slope=0.2, inplace)
(2): Conv2d(96, 192, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1), bias=False)
(3): BatchNorm2d(192, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(4): LeakyReLU(negative_slope=0.2, inplace)
(5): Conv2d(192, 384, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1), bias=False)
(6): BatchNorm2d(384, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(7): LeakyReLU(negative_slope=0.2, inplace)
(8): Conv2d(384, 768, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1), bias=False)
(9): BatchNorm2d(768, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(10): LeakyReLU(negative_slope=0.2, inplace)
)
(get_cond_logits): D_GET_LOGITS(
(outlogits): Sequential(
(0): Conv2d(896, 768, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm2d(768, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): LeakyReLU(negative_slope=0.2, inplace)
(3): Conv2d(768, 1, kernel_size=(4, 4), stride=(4, 4))
(4): Sigmoid()
)
)
)
Preparing training data...
Traceback (most recent call last):
File "main.py", line 77, in <module>
algo.train(dataloader, cfg.STAGE)
File "H:\StackGAN\StackGAN-Pytorch-master\code\trainer.py", line 158, in train
for i, data in enumerate(data_loader, 0):
File "C:\ProgramData\Anaconda3\lib\site-packages\torch\utils\data\dataloader.py", line 501, in __iter__
return _DataLoaderIter(self)
File "C:\ProgramData\Anaconda3\lib\site-packages\torch\utils\data\dataloader.py", line 289, in __init__
w.start()
File "C:\ProgramData\Anaconda3\lib\multiprocessing\process.py", line 105, in start
self._popen = self._Popen(self)
File "C:\ProgramData\Anaconda3\lib\multiprocessing\context.py", line 223, in _Popen
return _default_context.get_context().Process._Popen(process_obj)
File "C:\ProgramData\Anaconda3\lib\multiprocessing\context.py", line 322, in _Popen
return Popen(process_obj)
File "C:\ProgramData\Anaconda3\lib\multiprocessing\popen_spawn_win32.py", line 65, in __init__
reduction.dump(process_obj, to_child)
File "C:\ProgramData\Anaconda3\lib\multiprocessing\reduction.py", line 60, in dump
ForkingPickler(file, protocol).dump(obj)
OSError: [Errno 22] Invalid argument
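
The traceback stops in reduction.dump inside popen_spawn_win32.py, that is, while the parent process is pickling the worker process object to send it to the child. Below is a rough estimate of how much embedding data that could involve. It assumes (this is not shown in the log) that the dataset keeps the embeddings in memory as float32 or float64 and that they are pickled once per worker:

```python
# Shape printed in the log above: "embeddings: (82783, 5, 1024)"
shape = (82783, 5, 1024)

n_elements = 1
for dim in shape:
    n_elements *= dim

GIB = 2 ** 30

# Assumption: the dtype is not shown in the log, so both common float widths are tried.
gib_float32 = n_elements * 4 / GIB
gib_float64 = n_elements * 8 / GIB

# 'WORKERS': 4 in the printed config; assumed here to be the DataLoader worker count.
workers = 4

print(f"elements: {n_elements:,}")
print(f"float32: {gib_float32:.2f} GiB per copy, {gib_float32 * workers:.2f} GiB for {workers} workers")
print(f"float64: {gib_float64:.2f} GiB per copy, {gib_float64 * workers:.2f} GiB for {workers} workers")
```

An estimate in the gigabyte range would be consistent with the memory-related explanations given later in this thread, but it is not proof of the cause.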

@matteobsu

Same issue. Did you solve it somehow?

@AnwarUllahKhan
Author

@matteobsu No, I still get the error. I think it is caused by insufficient memory. I am now training the TensorFlow version of this instead.

@matteobsu

I think I actually solved it. Try opening:

C:\ProgramData\Anaconda3\lib\multiprocessing\reduction.py

and replace

ForkingPickler(file, protocol).dump(obj)
with
ForkingPickler(file, protocol).dumps(obj)

at line 60

@ARMkernal

I changed it to dumps, but then I got this error:
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "D:\Anaconda\lib\multiprocessing\spawn.py", line 105, in spawn_main
exitcode = _main(fd)
File "D:\Anaconda\lib\multiprocessing\spawn.py", line 113, in _main
preparation_data = reduction.pickle.load(from_parent)
EOFError: Ran out of input
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "D:\Anaconda\lib\multiprocessing\spawn.py", line 105, in spawn_main
exitcode = _main(fd)
File "D:\Anaconda\lib\multiprocessing\spawn.py", line 113, in _main
preparation_data = reduction.pickle.load(from_parent)
EOFError: Ran out of input
Traceback (most recent call last):
File "D:\Anaconda\lib\site-packages\torch\utils\data\dataloader.py", line 724, in _try_get_data
data = self._data_queue.get(timeout=timeout)
File "D:\Anaconda\lib\multiprocessing\queues.py", line 105, in get
raise Empty
_queue.Empty

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "E:/Python文件/机器学习/lab06NN/MyNN.py", line 57, in <module>
for i, data in enumerate(trainloader, 0):
File "D:\Anaconda\lib\site-packages\torch\utils\data\dataloader.py", line 804, in __next__
idx, data = self._get_data()
File "D:\Anaconda\lib\site-packages\torch\utils\data\dataloader.py", line 771, in _get_data
success, data = self._try_get_data()
File "D:\Anaconda\lib\site-packages\torch\utils\data\dataloader.py", line 737, in _try_get_data
raise RuntimeError('DataLoader worker (pid(s) {}) exited unexpectedly'.format(pids_str))
RuntimeError: DataLoader worker (pid(s) 7668, 8988) exited unexpectedly

How can I solve this runtime error?
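
One possible explanation for the "EOFError: Ran out of input" above, offered as a sketch rather than a confirmed diagnosis: in CPython, ForkingPickler.dumps is a classmethod that returns the pickled bytes instead of writing them to the file, so after the edit to reduction.py nothing reaches the pipe and the child process reads an empty stream. The example below uses an in-memory io.BytesIO as a stand-in for the pipe (an assumption made for illustration):

```python
import io
import pickle
from multiprocessing.reduction import ForkingPickler

obj = {"batch_size": 128}  # hypothetical payload, standing in for the process object

# dump(): pickled bytes are written into the file-like object.
pipe_dump = io.BytesIO()
ForkingPickler(pipe_dump, pickle.HIGHEST_PROTOCOL).dump(obj)
dump_written = len(pipe_dump.getvalue())

# dumps(): a classmethod that pickles into its own buffer and returns it;
# the file object passed to the constructor is never written to.
pipe_dumps = io.BytesIO()
returned = ForkingPickler(pipe_dumps, pickle.HIGHEST_PROTOCOL).dumps(obj)
dumps_written = len(pipe_dumps.getvalue())

# Reading from the empty "pipe" raises the same error the child process reported.
pipe_dumps.seek(0)
try:
    pickle.load(pipe_dumps)
    error = None
except EOFError as exc:
    error = str(exc)

print("bytes written by dump(): ", dump_written)
print("bytes written by dumps():", dumps_written)
print("error when reading:      ", error)
```

If this is what happened, reverting reduction.py to the original dump call would be the first step before trying other fixes such as reducing memory use.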

@zianzheng0806

Have you solved this new error? I have run into exactly the same problem.

@matzl

matzl commented Jan 25, 2024

I had a similar issue using multiprocessing. In my case, I saw RAM usage reach 100% in Task Manager just before the error appeared. I never had this issue on another computer with more RAM, so I guess the error is related to running out of RAM (or to a process that cannot wait long enough for data to come back when memory is full).
Workaround: decrease RAM usage (either in your Python script or by closing other programs).

@MeshkatShB

MeshkatShB commented May 14, 2024

I was able to make it work after a week of trial and error, searching the internet and testing every suggestion I could find. I did two things. In my case, I was using LdaMulticore from gensim.models:
from gensim.models import LdaMulticore
My code was:
lda = LdaMulticore(input_text, num_topics=topic_num, id2word=dictionary, passes=1, workers=8)

I solved it by removing passes=1 and workers=8 entirely, which left my code as:
lda = LdaMulticore(input_text, num_topics=topic_num, id2word=dictionary)
as suggested in: https://stackoverflow.com/questions/70218051/oserror-errno-22-invalid-argument-pickle-unpicklingerror-pickle-data-was.
At first, it didn't work. Then I decided to change my virtual environment: I created a fresh environment and installed all the libraries I use (without requirements.txt, since I suspected a dependency problem). Then, voilà! It is now running like a charm.

In conclusion: first, remove the workers and passes arguments from your code. Second, create a fresh venv (virtual environment), install all your packages, and try again. I hope this works for everyone.
@matzl @zianzheng0806 @ARMkernal

6 participants