API Reference

read_qog()

The main entry point of pyqog. It downloads a QoG dataset and returns it as a pandas DataFrame. Downloaded files are cached locally by default.

Signature

pyqog.read_qog(
    which_data: str = "basic",
    data_type: str = "time-series",
    year: int = 2026,
    data_dir: str | None = None,
    cache: bool = True,
    update_cache: bool = False
) -> pd.DataFrame

Parameters

Parameter	Type	Default	Description
which_data	str	"basic"	Dataset to download. Options: `"basic"`, `"standard"`, `"oecd"`, `"environmental"`, `"social_policy"`.
data_type	str	"time-series"	Data format. Options: `"time-series"` (panel data with country-year rows) or `"cross-sectional"` (one row per country).
year	int	2026	Publication year of the dataset version, not the year the data cover. For example, `2026` downloads the January 2026 release. Older years are fetched from the QoG data archive.
data_dir	str | None	None	Directory for cached files. If `None`, the default location `~/.pyqog/cache/` is used.
cache	bool	True	Whether to use local caching. If `True`, checks for a cached file before downloading. If `False`, always downloads from the server and does not save the file locally.
update_cache	bool	False	Force re-download even if a cached file exists. The new file replaces the old cached version.
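
The interaction between `cache` and `update_cache` described in the table can be summarized as a small decision function. This is a minimal sketch, not pyqog's code: the name `plan_fetch` and the returned labels are hypothetical, and the reference does not say what happens when `cache=False` is combined with `update_cache=True`, so the sketch assumes `cache=False` wins.

```python
def plan_fetch(cache: bool, update_cache: bool, cached_file_exists: bool) -> str:
    """Return the action implied by the documented flags (illustrative only)."""
    if not cache:
        # Caching disabled: always download and never write to disk.
        # Assumption: this takes precedence over update_cache.
        return "download_only"
    if update_cache or not cached_file_exists:
        # Forced refresh, or nothing cached yet: download and store the file.
        return "download_and_save"
    # A cached copy exists and no refresh was requested.
    return "use_cached_file"


for flags in [(True, False, True), (True, False, False), (True, True, True), (False, False, True)]:
    print(flags, "->", plan_fetch(*flags))
```

With the defaults (`cache=True`, `update_cache=False`), only the first call for a given dataset and year should need the network.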

Returns

pd.DataFrame — A pandas DataFrame containing the requested QoG dataset.

Raises

Exception	Condition
`ValueError`	Invalid `which_data`, `data_type`, or `year` value.
`requests.ConnectionError`	No internet connection and no cached file available.
`requests.HTTPError`	Server returned an error (e.g., 404 for unavailable dataset).
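
The `ValueError` condition can be pictured as an up-front check against the documented options. The sketch below is an assumption about how such a check might look, not the library's code: `check_arguments` is a hypothetical helper, and because the reference does not state which years count as valid, it only requires `year` to be an integer.

```python
VALID_DATASETS = ("basic", "standard", "oecd", "environmental", "social_policy")
VALID_DATA_TYPES = ("time-series", "cross-sectional")


def check_arguments(which_data: str, data_type: str, year: int) -> None:
    """Raise ValueError for arguments outside the documented options."""
    if which_data not in VALID_DATASETS:
        raise ValueError(f"which_data must be one of {VALID_DATASETS}, got {which_data!r}")
    if data_type not in VALID_DATA_TYPES:
        raise ValueError(f"data_type must be one of {VALID_DATA_TYPES}, got {data_type!r}")
    # Assumption: the valid year range is not documented, so only the type is checked.
    if isinstance(year, bool) or not isinstance(year, int):
        raise ValueError(f"year must be an integer publication year, got {year!r}")


try:
    check_arguments("premium", "time-series", 2026)
except ValueError as err:
    print(f"rejected: {err}")
```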

Examples

import pyqog

# Basic time-series (default)
df = pyqog.read_qog()

# Standard cross-sectional
df = pyqog.read_qog(which_data="standard", data_type="cross-sectional")

# OECD dataset from 2022
df = pyqog.read_qog(which_data="oecd", year=2022)

# Force re-download
df = pyqog.read_qog(update_cache=True)

# Custom cache directory
df = pyqog.read_qog(data_dir="/tmp/qog_data")

list_datasets()

Lists all available QoG datasets with their descriptions.

Signature

pyqog.list_datasets() -> pd.DataFrame

Parameters

None.

Returns

pd.DataFrame — A DataFrame with columns for dataset name, prefix, and description.

Example

import pyqog

datasets = pyqog.list_datasets()
print(datasets)

#             name  prefix                         description
# 0          basic     bas           Key governance indicators
# 1       standard     std  Comprehensive dataset (~2000 vars)
# 2           oecd    oecd               OECD member countries
# 3  environmental      ei            Environmental indicators
# 4  social_policy     soc                  Social policy data

list_versions()

Lists available publication years for a given dataset.

Signature

pyqog.list_versions(
    which_data: str = "basic"
) -> list[int]

Parameters

Parameter	Type	Default	Description
which_data	str	"basic"	Dataset name. Options: `"basic"`, `"standard"`, `"oecd"`, `"environmental"`, `"social_policy"`.

Returns

list[int] — A list of available publication years, sorted in descending order.

Example

import pyqog

versions = pyqog.list_versions("standard")
print(versions)
# [2026, 2025, 2024, 2023, 2022, 2021, 2020, 2019, 2018, ...]
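
Because the list is sorted in descending order, picking the newest release at or before a given year is a single pass. The helper below is a hypothetical convenience function, not part of pyqog, and the `versions` list holds illustrative values rather than real output.

```python
from typing import Optional


def latest_at_or_before(versions: list[int], target_year: int) -> Optional[int]:
    """Return the newest publication year that is <= target_year, or None."""
    for year in versions:  # relies on the newest-first ordering
        if year <= target_year:
            return year
    return None


versions = [2026, 2025, 2024, 2023, 2022]  # illustrative values
print(latest_at_or_before(versions, 2024))  # 2024
print(latest_at_or_before(versions, 2021))  # None
```

The result could then be passed as the `year` argument of `read_qog()`.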

get_codebook_url()

Returns the URL for a dataset's codebook PDF.

Signature

pyqog.get_codebook_url(
    which_data: str = "basic",
    year: int = 2026
) -> str

Parameters

Parameter	Type	Default	Description
which_data	str	"basic"	Dataset name. Options: `"basic"`, `"standard"`, `"oecd"`, `"environmental"`, `"social_policy"`.
year	int	2026	Publication year of the dataset version.

Returns

str — URL to the codebook PDF file.

Example

import pyqog

# Current version codebook
url = pyqog.get_codebook_url("standard", 2026)
print(url)
# https://www.qogdata.pol.gu.se/data/codebook_std_jan26.pdf

# Archived version codebook
url = pyqog.get_codebook_url("basic", 2020)
print(url)
# https://www.qogdata.pol.gu.se/dataarchive/codebook_bas_jan20.pdf
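
The two example URLs suggest a naming pattern: a `data/` folder for the current release, `dataarchive/` for older ones, and a file name built from the dataset prefix and a two-digit year. The sketch below reproduces that pattern under stated assumptions. It is inferred from the two examples only, `sketch_codebook_url` and `CURRENT_YEAR` are hypothetical names, and the real function may handle other datasets or years differently.

```python
# Prefixes as shown in the list_datasets() example output.
PREFIXES = {
    "basic": "bas",
    "standard": "std",
    "oecd": "oecd",
    "environmental": "ei",
    "social_policy": "soc",
}
BASE_URL = "https://www.qogdata.pol.gu.se"
CURRENT_YEAR = 2026  # assumption: the release treated as current


def sketch_codebook_url(which_data: str = "basic", year: int = 2026) -> str:
    """Build a codebook URL following the pattern seen in the two examples."""
    prefix = PREFIXES[which_data]
    # Assumption: only the current release lives under data/.
    folder = "data" if year >= CURRENT_YEAR else "dataarchive"
    return f"{BASE_URL}/{folder}/codebook_{prefix}_jan{year % 100:02d}.pdf"


print(sketch_codebook_url("standard", 2026))
print(sketch_codebook_url("basic", 2020))
```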

search_variables()

Searches for column names in a DataFrame that match a given pattern. Useful for finding variables in large QoG datasets.

Signature

pyqog.search_variables(
    df: pd.DataFrame,
    pattern: str
) -> list[str]

Parameters

Parameter	Type	Default	Description
df	pd.DataFrame	—	A pandas DataFrame (typically returned by `read_qog()`).
pattern	str	—	Search pattern (case-insensitive). Matches any part of the column name.

Returns

list[str] — A list of column names that match the pattern.

Example

import pyqog

df = pyqog.read_qog(which_data="standard")

# Find corruption-related variables
corruption_vars = pyqog.search_variables(df, "corrupt")
print(corruption_vars)

# Find GDP-related variables
gdp_vars = pyqog.search_variables(df, "gdp")
print(gdp_vars)

# Find democracy-related variables
demo_vars = pyqog.search_variables(df, "demo")
print(demo_vars)
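
To make the matching rule concrete, here is a minimal sketch of a case-insensitive "matches any part of the name" search over a plain list of column names. It assumes plain substring matching, while the library may interpret the pattern differently (for example as a regular expression). `match_columns` is a hypothetical stand-in and the column names are made up for illustration.

```python
def match_columns(columns: list[str], pattern: str) -> list[str]:
    """Return the names that contain pattern, ignoring case."""
    needle = pattern.lower()
    return [name for name in columns if needle in name.lower()]


# Made-up column names, not actual QoG variable names.
columns = ["cname", "year", "corruption_index", "GDP_per_capita", "gdp_growth", "democracy_score"]

print(match_columns(columns, "gdp"))      # ['GDP_per_capita', 'gdp_growth']
print(match_columns(columns, "CORRUPT"))  # ['corruption_index']
```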

describe_dataset()

Returns summary information about a QoG dataset: the number of rows, variables, countries, and years, plus the codebook URL.

Signature

pyqog.describe_dataset(
    which_data: str = "basic",
    year: int = 2026
) -> dict

Parameters

Parameter	Type	Default	Description
which_data	str	"basic"	Dataset name. Options: `"basic"`, `"standard"`, `"oecd"`, `"environmental"`, `"social_policy"`.
year	int	2026	Publication year of the dataset version.

Returns

dict — A dictionary with summary information:

Key	Type	Description
`"dataset"`	str	Dataset name
`"version"`	int	Publication year
`"n_rows"`	int	Number of rows
`"n_vars"`	int	Number of variables (columns)
`"n_countries"`	int	Number of unique countries
`"n_years"`	int | None	Number of unique years (time-series only)
`"codebook_url"`	str	URL to the codebook PDF
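
To illustrate what the count fields mean, the sketch below derives them from a handful of toy records. It is not how pyqog computes the summary. `summarize` is a hypothetical helper, the rows are invented, and the column names `cname` and `year` are assumptions about how countries and years are identified. The `codebook_url` key is left out because it does not depend on the data.

```python
from typing import Any, Optional


def summarize(rows: list[dict[str, Any]], dataset: str, version: int,
              country_key: str = "cname", year_key: str = "year") -> dict[str, Any]:
    """Compute the count fields of the summary from a list of row dicts."""
    columns: set[str] = set()
    for row in rows:
        columns.update(row)  # collect every column name that appears
    countries = {row[country_key] for row in rows if country_key in row}
    years = {row[year_key] for row in rows if year_key in row}
    n_years: Optional[int] = len(years) if years else None  # None when there is no year column
    return {
        "dataset": dataset,
        "version": version,
        "n_rows": len(rows),
        "n_vars": len(columns),
        "n_countries": len(countries),
        "n_years": n_years,
    }


# Invented rows for illustration only.
toy_rows = [
    {"cname": "Sweden", "year": 2020, "score": 1.0},
    {"cname": "Sweden", "year": 2021, "score": 1.1},
    {"cname": "Norway", "year": 2020, "score": 0.9},
]
print(summarize(toy_rows, "basic", 2026))
```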

Example

import pyqog

info = pyqog.describe_dataset("standard", 2026)
print(info)
# {
#     "dataset": "standard",
#     "version": 2026,
#     "n_rows": 15000,
#     "n_vars": 2100,
#     "n_countries": 195,
#     "n_years": 75,
#     "codebook_url": "https://www.qogdata.pol.gu.se/data/codebook_std_jan26.pdf"
# }