does not return hidden states

#15

by wassname - opened Dec 14, 2023

Discussion

wassname

Dec 14, 2023

Phi overrides the pretrained transformer but does not extend its capability to return hidden states.
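
For readers hitting the same limitation, one generic workaround that needs no change to the model code is to capture per-layer outputs with PyTorch forward hooks. The sketch below is only an illustration under stated assumptions: `ToyModel`, `ToyBlock`, and `capture_hidden_states` are hypothetical names, and the toy model merely stands in for a model whose forward pass returns logits only. It is not the Phi-2 code, and the attribute holding the transformer blocks in the real model may be named differently.

```python
import torch
import torch.nn as nn


class ToyBlock(nn.Module):
    """Hypothetical stand-in for one transformer block (residual MLP only)."""

    def __init__(self, dim):
        super().__init__()
        self.linear = nn.Linear(dim, dim)

    def forward(self, x):
        return x + torch.tanh(self.linear(x))


class ToyModel(nn.Module):
    """Hypothetical model whose forward returns logits and nothing else."""

    def __init__(self, vocab=50, dim=8, n_layers=3):
        super().__init__()
        self.embed = nn.Embedding(vocab, dim)
        self.layers = nn.ModuleList([ToyBlock(dim) for _ in range(n_layers)])
        self.head = nn.Linear(dim, vocab)

    def forward(self, input_ids):
        x = self.embed(input_ids)
        for layer in self.layers:
            x = layer(x)
        return self.head(x)


def capture_hidden_states(model, blocks, *args, **kwargs):
    """Run the model once and collect the output of every module in `blocks`."""
    captured = []

    def hook(module, inputs, output):
        # Some blocks return tuples; keep the first element in that case.
        out = output[0] if isinstance(output, tuple) else output
        captured.append(out.detach())

    handles = [block.register_forward_hook(hook) for block in blocks]
    try:
        with torch.no_grad():
            result = model(*args, **kwargs)
    finally:
        # Always remove the hooks so later forward passes are unaffected.
        for handle in handles:
            handle.remove()
    return result, captured


torch.manual_seed(0)
model = ToyModel()
input_ids = torch.randint(0, 50, (2, 5))
logits, hidden_states = capture_hidden_states(model, model.layers, input_ids)
print(logits.shape, len(hidden_states), hidden_states[0].shape)
```

Hooks only observe activations, so this helps with probing but not with feeding custom input embeddings into the model.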

wassname

Dec 15, 2023

edited Dec 15, 2023

Here's a modified version that returns attentions and hidden states: https://huggingface.co/wassname/phi-2-GPTQ_w_hidden_states/blob/main/configuration_phi.py

edmond

Dec 17, 2023

@wassname is there any plan to actually change phi-2?
I ask because the following warning remains on the main page:
"Remark: In the generation function, our model currently does not support beam search (num_beams > 1). Furthermore, in the forward pass of the model, we currently do not support outputting hidden states or attention values, or using custom input embeddings."
I personally use custom input embeddings a lot, and this makes phi unusable for many use cases, in my opinion.
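
For context, this is roughly what the two missing features look like on a model that does support them through the standard `transformers` interface. The sketch is an assumption-laden illustration, not Phi-2 code: it uses a tiny, randomly initialised GPT-2 as a stand-in (so nothing is downloaded), and the config sizes are arbitrary. It passes `inputs_embeds` instead of `input_ids` and requests hidden states with `output_hidden_states=True`.

```python
import torch
from transformers import GPT2Config, GPT2LMHeadModel

# Tiny randomly initialised GPT-2: a stand-in for illustration, NOT Phi-2.
config = GPT2Config(vocab_size=100, n_positions=32, n_embd=16, n_layer=2, n_head=2)
model = GPT2LMHeadModel(config).eval()

torch.manual_seed(0)
input_ids = torch.randint(0, config.vocab_size, (1, 6))

# Look up the token embeddings manually; custom embeddings would be
# substituted or edited at this point before being passed to the model.
inputs_embeds = model.get_input_embeddings()(input_ids)

with torch.no_grad():
    out = model(inputs_embeds=inputs_embeds, output_hidden_states=True)

# hidden_states holds the embedding output plus one tensor per layer.
print(len(out.hidden_states), out.hidden_states[-1].shape)
```

The remark quoted above says the Phi-2 forward pass did not support either of these at the time of this thread.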

gugarosa

Microsoft org Dec 20, 2023

Hello @edmond and @wassname!

This will be updated once we integrate with the Phi implementation in HF.

Best regards,
Gustavo.

gugarosa changed discussion status to closed Dec 20, 2023

wassname

Dec 21, 2023

Thanks Gustavo, much appreciated. Phi-2 is an awesome model for research, as it fits on consumer GPUs even when doing strange experiments (VAEs, adapters, probing).