pp_force_state cannot work on my machine. #96

Open
andyzhanged opened this issue Jul 30, 2020 · 3 comments

@andyzhanged

Hi, my GPU is a Vega20, and my machine info is as follows:

zhanged@dcu:~$ lsb_release
No LSB modules are available.
zhanged@dcu:~$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 18.04.4 LTS
Release:        18.04
Codename:       bionic
zhanged@dcu:~$ cat /proc/version
Linux version 5.0.0-23-generic (buildd@lgw01-amd64-030) (gcc version 7.4.0 (Ubuntu 7.4.0-1ubuntu1~18.04.1)) #24~18.04.1-Ubuntu SMP Mon Jul 29 16:12:28 UTC 2019
zhanged@dcu:~$ dkms status
amdgpu, 3.3-19, 5.0.0-23-generic, x86_64: installed

When I try to set the power state through the sysfs file pp_force_state, it does not work. Debugging the code, I found that amdgpu_dpm_get_pp_num_states(adev, &data) returns -22 (-EINVAL; maybe my card does not support it), but the code carries on instead of returning at that point. I think we should return there and report the failure to the user.

static ssize_t amdgpu_set_pp_force_state(struct device *dev,
		struct device_attribute *attr,
		const char *buf,
		size_t count)
{
	struct drm_device *ddev = dev_get_drvdata(dev);
	struct amdgpu_device *adev = ddev->dev_private;
	enum amd_pm_state_type state = 0;
	unsigned long idx;
	int ret;

	if (amdgpu_sriov_vf(adev) && !amdgpu_sriov_is_pp_one_vf(adev))
		return -EINVAL;

	if (strlen(buf) == 1)
		adev->pp_force_state_enabled = false;
	else if (is_support_sw_smu(adev))
		adev->pp_force_state_enabled = false;
	else if (adev->powerplay.pp_funcs->dispatch_tasks &&
			adev->powerplay.pp_funcs->get_pp_num_states) {
		struct pp_states_info data;

		ret = kstrtoul(buf, 0, &idx);
		if (ret || idx >= ARRAY_SIZE(data.states))
			return -EINVAL;

		idx = array_index_nospec(idx, ARRAY_SIZE(data.states));

		amdgpu_dpm_get_pp_num_states(adev, &data);
		state = data.states[idx];

		ret = pm_runtime_get_sync(ddev->dev);
		if (ret < 0)
			return ret;

		/* only set user selected power states */
		if (state != POWER_STATE_TYPE_INTERNAL_BOOT &&
		    state != POWER_STATE_TYPE_DEFAULT) {
			amdgpu_dpm_dispatch_task(adev,
					AMD_PP_TASK_ENABLE_USER_STATE, &state);
			adev->pp_force_state_enabled = true;
		}
		pm_runtime_mark_last_busy(ddev->dev);
		pm_runtime_put_autosuspend(ddev->dev);
	}

	return count;
}
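
The proposed change can be sketched outside the kernel. This is a user-space mock under stated assumptions, not the driver code: `mock_states_info`, `mock_get_pp_num_states`, `mock_set_pp_force_state`, and the `table_ready` flag are hypothetical stand-ins for `pp_states_info`, `amdgpu_dpm_get_pp_num_states`, `amdgpu_set_pp_force_state`, and the driver's internal state. It only illustrates returning the error code instead of reading `data.states[idx]` after a failed query.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define MOCK_MAX_STATES 16

/* Hypothetical stand-in for struct pp_states_info. */
struct mock_states_info {
	unsigned int nums;
	int states[MOCK_MAX_STATES];
};

/* Hypothetical stand-in for amdgpu_dpm_get_pp_num_states(): fails with
 * -EINVAL when the power state table was never set up. */
static int mock_get_pp_num_states(int table_ready, struct mock_states_info *data)
{
	assert(data != NULL);
	memset(data, 0, sizeof(*data));
	if (!table_ready)
		return -EINVAL;
	data->nums = 2;
	data->states[0] = 1;
	data->states[1] = 2;
	return 0;
}

/* Mock of the sysfs store handler: returns count on success and a
 * negative errno value on failure. */
static long mock_set_pp_force_state(int table_ready, const char *buf, size_t count)
{
	struct mock_states_info data;
	char *end;
	unsigned long idx;
	int ret;

	errno = 0;
	idx = strtoul(buf, &end, 0);
	if (errno || end == buf || idx >= MOCK_MAX_STATES)
		return -EINVAL;

	ret = mock_get_pp_num_states(table_ready, &data);
	if (ret)
		return ret; /* proposed change: report the failure to the caller */

	/* data.states[idx] is only read once the query has succeeded. */
	(void)data.states[idx];
	return (long)count;
}
```

With a check like this, a write to pp_force_state on a card without a power state table would fail with an error instead of silently reporting success.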
@fxkamd
Contributor

fxkamd commented Jul 30, 2020

Looks like powerplay is not enabled on your GPU. Can you post a full dmesg log?

@andyzhanged
Author

Powerplay is enabled. The other powerplay-related sysfs files (power_dpm_force_performance_level, pp_power_profile_mode, pp_dpm_dcefclk) work well. pp_dpm_get_pp_num_states returns at the line "if (!hwmgr || !hwmgr->pm_en ||!hwmgr->ps)".

static int pp_dpm_get_pp_num_states(void *handle,
		struct pp_states_info *data)
{
	struct pp_hwmgr *hwmgr = handle;
	int i;

	memset(data, 0, sizeof(*data));

	if (!hwmgr || !hwmgr->pm_en ||!hwmgr->ps)
		return -EINVAL;

	mutex_lock(&hwmgr->smu_lock);

	data->nums = hwmgr->num_ps;

The reason is that psm_init_power_state_table returns early because hwmgr->hwmgr_func->get_num_of_pp_table_entries == NULL, so some members of hwmgr are never initialized.

int psm_init_power_state_table(struct pp_hwmgr *hwmgr)
{
	int result;
	unsigned int i;
	unsigned int table_entries;
	struct pp_power_state *state;
	int size;

	if (hwmgr->hwmgr_func->get_num_of_pp_table_entries == NULL)
		return 0;

	if (hwmgr->hwmgr_func->get_power_state_size == NULL)
		return 0;

	hwmgr->num_ps = table_entries = hwmgr->hwmgr_func->get_num_of_pp_table_entries(hwmgr);

	hwmgr->ps_size = size = hwmgr->hwmgr_func->get_power_state_size(hwmgr) +
					  sizeof(struct pp_power_state);

So my guess is that Vega20 is not supported for now.
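
The chain described above can be reproduced with a small user-space mock. Everything here is an assumption for illustration: `mock_hwmgr`, `mock_hwmgr_func`, `mock_init_power_state_table`, `mock_query_num_states`, and `mock_two_entries` are hypothetical, trimmed-down stand-ins for the powerplay structures and functions quoted above, and the allocation is simplified. The sketch only shows how an init that returns 0 without a table leads to a later -EINVAL from the query.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

struct mock_hwmgr;

/* Hypothetical stand-in for the hwmgr callback table. */
struct mock_hwmgr_func {
	int (*get_num_of_pp_table_entries)(struct mock_hwmgr *hwmgr);
};

/* Hypothetical stand-in for struct pp_hwmgr. */
struct mock_hwmgr {
	const struct mock_hwmgr_func *hwmgr_func;
	int pm_en;
	unsigned int num_ps;
	void *ps; /* power state table, NULL until initialized */
};

/* Mirrors the early return quoted above: a missing callback is treated
 * as success, which leaves ps NULL and num_ps at 0. */
static int mock_init_power_state_table(struct mock_hwmgr *hwmgr)
{
	assert(hwmgr != NULL && hwmgr->hwmgr_func != NULL);

	if (hwmgr->hwmgr_func->get_num_of_pp_table_entries == NULL)
		return 0;

	hwmgr->num_ps =
		(unsigned int)hwmgr->hwmgr_func->get_num_of_pp_table_entries(hwmgr);
	hwmgr->ps = calloc(hwmgr->num_ps, sizeof(int));
	return hwmgr->ps ? 0 : -ENOMEM;
}

/* Mirrors the guard in pp_dpm_get_pp_num_states(). */
static int mock_query_num_states(struct mock_hwmgr *hwmgr, unsigned int *nums)
{
	*nums = 0;
	if (!hwmgr || !hwmgr->pm_en || !hwmgr->ps)
		return -EINVAL;
	*nums = hwmgr->num_ps;
	return 0;
}

/* Example callback for a device that does provide a state table. */
static int mock_two_entries(struct mock_hwmgr *hwmgr)
{
	(void)hwmgr;
	return 2;
}
```

In this mock, a device whose callback table leaves `get_num_of_pp_table_entries` unset initializes "successfully" but every later query fails, which matches the behavior seen here.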

@ppanchad-amd

@andyzhanged Apologies for the lack of response. Can you please check whether your issue still exists with the latest ROCm 6.2? If not, please close the ticket. Thanks!
