add position_ids in forward #456
Conversation
thanks for the addition @jiqing-feng, you're right, we need to add support for `position_ids` now that it has been integrated in optimum
optimum/intel/generation/modeling.py (Outdated)

```diff
@@ -88,7 +89,7 @@ def jit_trace(model: PreTrainedModel, task: str, use_cache: bool = False):
         traced_model(**model_inputs)
         traced_model(**model_inputs)

-    return traced_model
+    return traced_model, has_position_ids
```
I think we should keep `jit_trace` as it is. Suggested change:

```diff
-    return traced_model, has_position_ids
+    return traced_model
```
optimum/intel/generation/modeling.py (Outdated)

```diff
@@ -116,6 +118,7 @@ def __init__(
         self._device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
         self.normalized_config = NormalizedConfigManager.get_normalized_config_class(config.model_type)(config)
         self.model_dtype = kwargs.get("model_dtype", None)
+        self.has_position_ids = has_position_ids
```
no need to have an attribute, we can use `MODEL_TYPES_REQUIRING_POSITION_IDS` directly
Thanks, I will use it.
optimum/intel/generation/modeling.py (Outdated)

```python
        position_ids = kwargs.get("position_ids", None)
        if self.has_position_ids and position_ids is not None:
            inputs.update({"position_ids": position_ids})
        elif self.has_position_ids and position_ids is None:
            seq_length = input_ids.shape[-1]
            if not self.use_cache:
                past_key_values_length = 0
            else:
                past_key_values_length = past_key_values[0][1].shape[-2]
            position_ids = torch.arange(
                past_key_values_length, seq_length + past_key_values_length, dtype=torch.long, device=self._device
            ).unsqueeze(0)
            inputs.update({"position_ids": position_ids})
        elif not self.has_position_ids and position_ids is not None:
            logger.warning("You miss the position_ids in the inputs")
```
I don't think we should generate the `position_ids` here, since you already added it in `prepare_inputs_for_generation`. I would just pass it when needed by checking the graph, as done in https://github.com/huggingface/optimum/blob/e7bd60dd2c1e295263ba57a4e468a62ab5b179e8/optimum/onnxruntime/modeling_decoder.py#L229-L232
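(For reference, a minimal sketch of the pattern used on the ONNX Runtime side linked above; the model path is only a placeholder, and this is not the exact optimum code.)

```python
import onnxruntime

# Placeholder path: an already exported decoder ONNX file.
session = onnxruntime.InferenceSession("decoder_model.onnx", providers=["CPUExecutionProvider"])

# The session exposes the inputs the exported graph actually expects,
# so position_ids is only passed when the graph was exported with it.
input_names = {node.name for node in session.get_inputs()}
needs_position_ids = "position_ids" in input_names
```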
Yes, that is more reasonable. However, for generation tasks, different decoding methods produce different inputs. For example, llama with `greedy_search` includes `position_ids` in the inputs, but `assisted_decoding` only has `input_ids`. Besides, we already generate `attention_mask` in the forward. WDYT?
I see your point. I'm OK with the modification, but I think we need to add a test for every architecture to verify we create it correctly. For example, is `past_key_values_length = past_key_values[0][1].shape[-2]` correct for every architecture? (It looks like it from the empty pkv generation above, but I would like to verify, and also make sure this stays compatible if we add support for new architectures.)
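(As a small illustration of the shape assumption being questioned; the dimensions below are made up and the multi-query layout is only sketched, not taken from a specific architecture.)

```python
import torch

batch, num_heads, past_len, head_dim = 1, 4, 7, 8

# Standard layout: each layer stores a (key, value) pair of shape
# (batch, num_heads, past_len, head_dim), so the past length is shape[-2].
standard_pkv = (
    (
        torch.zeros(batch, num_heads, past_len, head_dim),
        torch.zeros(batch, num_heads, past_len, head_dim),
    ),
)
assert standard_pkv[0][1].shape[-2] == past_len

# Multi-query layout: a single tensor per layer whose second-to-last dimension
# is also the past length, hence the separate branch added later in this PR.
mqa_pkv = (torch.zeros(batch, past_len, 2 * head_dim),)
assert mqa_pkv[0].shape[-2] == past_len
```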
Hi @echarlaix. I have fixed all tests, would you please help me review these changes? Thx!

Hi @echarlaix, could you have a look at this change? Thx 😄!

Hi @echarlaix. Would you please help me review these changes? This change could avoid a forward failure caused by a missing `position_ids` input.
optimum/intel/generation/modeling.py (Outdated)

```python
        if has_position_ids and position_ids is not None:
            inputs.update({"position_ids": position_ids})
        elif has_position_ids and position_ids is None:
            seq_length = input_ids.shape[-1]
            if not self.use_cache:
                past_key_values_length = 0
            else:
                past_key_values_length = (
                    past_key_values[0].shape[-2]
                    if model_type.replace("-", "_") in MULTI_QUERY_ATTN_MODELS
                    else past_key_values[0][1].shape[-2]
                )
            position_ids = torch.arange(
                past_key_values_length, seq_length + past_key_values_length, dtype=torch.long, device=self._device
            ).unsqueeze(0)
            inputs.update({"position_ids": position_ids})
        elif not has_position_ids and position_ids is not None:
            logger.warning("You miss the position_ids in the inputs")
```
I think it would be better to check directly in the model graph whether `position_ids` is one of the model's expected inputs, if that can be done. If not, this new addition will create issues for all the previously exported models (the ones that were exported without any `position_ids`) for all architectures from `MODEL_TYPES_REQUIRING_POSITION_IDS`.
Suggested change:

```diff
-        if has_position_ids and position_ids is not None:
-            inputs.update({"position_ids": position_ids})
-        elif has_position_ids and position_ids is None:
-            seq_length = input_ids.shape[-1]
-            if not self.use_cache:
-                past_key_values_length = 0
-            else:
-                past_key_values_length = (
-                    past_key_values[0].shape[-2]
-                    if model_type.replace("-", "_") in MULTI_QUERY_ATTN_MODELS
-                    else past_key_values[0][1].shape[-2]
-                )
-            position_ids = torch.arange(
-                past_key_values_length, seq_length + past_key_values_length, dtype=torch.long, device=self._device
-            ).unsqueeze(0)
-            inputs.update({"position_ids": position_ids})
-        elif not has_position_ids and position_ids is not None:
-            logger.warning("You miss the position_ids in the inputs")
+        if "position_ids" in self.input_names:
+            if position_ids is None:
+                position_ids = ...
+            inputs["position_ids"] = position_ids
```
also concerning the `position_ids` (and the `attention_mask`) computation, I think we should do as follows (optimum-intel/optimum/intel/openvino/modeling_decoder.py, lines 365 to 383 in 819e3e8):
if "attention_mask" in self.input_names or "position_ids" in self.input_names: | |
if attention_mask is not None: | |
attention_mask = np.array(attention_mask) | |
else: | |
attention_mask = np.ones( | |
(input_ids.shape[0], input_ids.shape[1] + past_len), dtype=inputs["input_ids"].dtype | |
) | |
if "attention_mask" in self.input_names: | |
inputs["attention_mask"] = attention_mask | |
if "position_ids" in self.input_names: | |
if position_ids is not None: | |
position_ids = np.array(position_ids) | |
else: | |
position_ids = np.cumsum(attention_mask, axis=1) - 1 | |
position_ids[attention_mask == 0] = 1 | |
if past_key_values: | |
position_ids = np.expand_dims(position_ids[:, -1], axis=-1) |
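(For comparison, a minimal torch sketch of the same `position_ids` derivation; `prepare_position_ids` is a hypothetical helper, not part of either file.)

```python
import torch


def prepare_position_ids(attention_mask: torch.Tensor, has_past: bool) -> torch.Tensor:
    # Cumulative sum of the mask minus one gives each token's position;
    # padded positions are clamped to 1, mirroring the numpy snippet above.
    position_ids = attention_mask.long().cumsum(-1) - 1
    position_ids.masked_fill_(attention_mask == 0, 1)
    if has_past:
        # With a KV cache, only the position of the newly generated token is needed.
        position_ids = position_ids[:, -1:]
    return position_ids


# Example: batch of two sequences, the second one left-padded by one token.
mask = torch.tensor([[1, 1, 1, 1], [0, 1, 1, 1]])
print(prepare_position_ids(mask, has_past=False))
# tensor([[0, 1, 2, 3],
#         [1, 0, 1, 2]])
```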
I am afraid there is no `input_names` attribute in this class.
Yes, could this be added, do you think?
Yes, I can try it.
optimum/intel/generation/modeling.py (Outdated)

```diff
@@ -264,6 +262,7 @@ def forward(
         }

         model_type = self.config.model_type.replace("_", "-")
+        has_position_ids = True if model_type in MODEL_TYPES_REQUIRING_POSITION_IDS else False
```
Suggested change:

```diff
-        has_position_ids = True if model_type in MODEL_TYPES_REQUIRING_POSITION_IDS else False
+        has_position_ids = model_type in MODEL_TYPES_REQUIRING_POSITION_IDS
```
Hi @echarlaix. I am afraid I think it would be better to keep it this way and not use …
I was suggesting to check the model graph directly, as done in https://github.com/huggingface/optimum-intel/blob/v1.12.1/optimum/intel/openvino/modeling_base.py#L82 (to check whether `position_ids` is one of the model's expected inputs). If that can't be done, this PR might result in issues for all the previously exported models (the ones that were exported without any `position_ids`) for all architectures from `MODEL_TYPES_REQUIRING_POSITION_IDS`.
Hi @echarlaix. I think I got what you mean. The forward inputs are now checked against the graph model inputs; could you please help me review these changes? Thx!
Hi @echarlaix. Sorry for the misunderstanding. I just found that there is no way to get the input names from a TorchScript model, so I can only get the input names when tracing the model. I would like to hear your opinion. Thx!
I was able to get something with `input_names = [inputs.debugName() for inputs in model.graph.inputs()]`, can you check it out?
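(A minimal runnable sketch of this idea with a toy module; the `[1:]` slice that drops the leading self input and the exact debug names are assumptions that may vary across PyTorch versions.)

```python
import torch


class ToyDecoder(torch.nn.Module):
    def forward(self, input_ids, position_ids):
        return input_ids + position_ids


example = (torch.zeros(1, 4, dtype=torch.long), torch.arange(4).unsqueeze(0))
traced = torch.jit.trace(ToyDecoder(), example)

# The first graph input refers to the module itself, so it is skipped to keep
# only the tensor inputs the traced forward expects.
input_names = [inp.debugName() for inp in traced.graph.inputs()][1:]
print(input_names)  # e.g. ['input_ids', 'position_ids']
```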
Hi @echarlaix. Thanks for your advice, it perfectly fixed my problem. Would you please review these changes? Thx! And the failed CIs are not related to my changes.
Added updates in jiqing-feng#2, can you take a look?
Also, could you add a test before we merge?
add input names
Hi @echarlaix. I have merged your changes and also added the tests. Would you please help review the test function? Thx! BTW, the failed CIs seem not related to our changes.
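(A rough outline of what such a test could look like; `TSModelForCausalLM`, the `export=True` argument, and the tiny checkpoint are assumptions based on this discussion, and the actual test added in the PR may differ.)

```python
import torch
from transformers import AutoTokenizer

# Assumed import path for the TorchScript wrapper discussed in this PR.
from optimum.intel.generation.modeling import TSModelForCausalLM


def test_forward_without_position_ids():
    # Tiny checkpoint used only to keep the test fast; llama is one of the
    # architectures that requires position_ids.
    model_id = "hf-internal-testing/tiny-random-LlamaForCausalLM"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = TSModelForCausalLM.from_pretrained(model_id, export=True)

    inputs = tokenizer("This is a sample input", return_tensors="pt")
    # forward should succeed even when the caller does not pass position_ids,
    # since they are only injected when the traced graph expects them.
    with torch.no_grad():
        outputs = model(**inputs)
    assert outputs.logits.shape[0] == inputs["input_ids"].shape[0]
```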
* add position_ids in forward
* check if jit model need position_ids
* use MODEL_TYPES_REQUIRING_POSITION_IDS
* fix has_position_ids
* fix position_ids length
* rm useless params
* check model inputs by input names
* fix format
* check input names in graph model
* fix style
* consider eager model in input_names
* add input names
* add text input names
* fix styl;e
* Update optimum/intel/generation/modeling.py
* fix format
* Update optimum/intel/generation/modeling.py

Co-authored-by: Ella Charlaix <ella@huggingface.co>
Co-authored-by: Ella Charlaix <80481427+echarlaix@users.noreply.github.com>
Hi @echarlaix, do you think we should add `position_ids` in the forward of the generation model? `optimum` has added support for generating `position_ids` in this PR. cc @changwangss