read_qog()

The main entry point of pyqog. It downloads a QoG dataset and returns it as a pandas DataFrame, caching the downloaded file locally by default.

Signature
pyqog.read_qog(
    which_data: str = "basic",
    data_type: str = "time-series",
    year: int = 2026,
    data_dir: str | None = None,
    cache: bool = True,
    update_cache: bool = False
) -> pd.DataFrame
Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| which_data | str | "basic" | Dataset to download. Options: "basic", "standard", "oecd", "environmental", "social_policy". |
| data_type | str | "time-series" | Data format. Options: "time-series" (panel data with country-year rows) or "cross-sectional" (one row per country). |
| year | int | 2026 | Publication year of the dataset version, not the year the data covers. For example, 2026 downloads the Jan 2026 release. Older years are fetched from the QoG data archive. |
| data_dir | str or None | None | Directory for cached files. If None, the default ~/.pyqog/cache/ is used; any other value overrides that location. |
| cache | bool | True | Whether to use the local cache. If True, checks for a cached file before downloading. If False, always downloads from the server and does not save the file locally. |
| update_cache | bool | False | Forces a re-download even if a cached file exists. The new file replaces the old cached version. |

Returns

pd.DataFrame — A pandas DataFrame containing the requested QoG dataset.

Raises

| Exception | Condition |
|---|---|
| ValueError | Invalid which_data, data_type, or year value. |
| requests.ConnectionError | No internet connection and no cached file available. |
| requests.HTTPError | Server returned an error (e.g., 404 for unavailable dataset). |

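
The ValueError condition can be anticipated before any network access. The following is a minimal sketch, not pyqog's own validation: check_read_qog_args is a hypothetical helper, and the option sets are copied from the parameter descriptions above. The year is left unchecked because the valid range is not stated here.

```python
# Option sets copied from the parameter descriptions; the helper is hypothetical.
VALID_DATASETS = {"basic", "standard", "oecd", "environmental", "social_policy"}
VALID_DATA_TYPES = {"time-series", "cross-sectional"}


def check_read_qog_args(which_data: str, data_type: str) -> None:
    """Raise ValueError early for options the documentation does not list."""
    if which_data not in VALID_DATASETS:
        raise ValueError(
            f"which_data must be one of {sorted(VALID_DATASETS)}, got {which_data!r}"
        )
    if data_type not in VALID_DATA_TYPES:
        raise ValueError(
            f"data_type must be one of {sorted(VALID_DATA_TYPES)}, got {data_type!r}"
        )


check_read_qog_args("standard", "cross-sectional")  # valid, returns None

try:
    check_read_qog_args("standard", "panel")  # "panel" is not a documented option
except ValueError as exc:
    print(exc)
```
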
Examples
import pyqog

# Basic time-series (default)
df = pyqog.read_qog()

# Standard cross-sectional
df = pyqog.read_qog(which_data="standard", data_type="cross-sectional")

# OECD dataset from 2022
df = pyqog.read_qog(which_data="oecd", year=2022)

# Force re-download
df = pyqog.read_qog(update_cache=True)

# Custom cache directory
df = pyqog.read_qog(data_dir="/tmp/qog_data")
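
How cache and update_cache interact can be summarized as a small decision function. This is a sketch of the behavior described in the parameter descriptions, not pyqog's actual implementation. FetchPlan and plan_fetch are hypothetical names. Two points are assumptions: that cache=False takes precedence over update_cache=True, and that a cache miss saves the downloaded file.

```python
from typing import NamedTuple


class FetchPlan(NamedTuple):
    download: bool  # fetch the file from the server
    save: bool      # write the downloaded file into the cache directory


def plan_fetch(cache: bool, update_cache: bool, cached_exists: bool) -> FetchPlan:
    """Hypothetical helper: decide what a read_qog() call would do."""
    if not cache:
        # cache=False: always download, never save locally.
        # Assumption: this wins even when update_cache=True.
        return FetchPlan(download=True, save=False)
    if update_cache or not cached_exists:
        # Forced refresh or cache miss: download and (re)write the cached file.
        return FetchPlan(download=True, save=True)
    # Cache hit: use the local copy, no network access needed.
    return FetchPlan(download=False, save=False)


print(plan_fetch(cache=True, update_cache=False, cached_exists=True))
print(plan_fetch(cache=True, update_cache=True, cached_exists=True))
print(plan_fetch(cache=False, update_cache=False, cached_exists=True))
```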

list_datasets()

Lists all available QoG datasets with their descriptions.

Signature
pyqog.list_datasets() -> pd.DataFrame
Parameters

None.

Returns

pd.DataFrame — A DataFrame with columns for dataset name, prefix, and description.

Example
import pyqog

datasets = pyqog.list_datasets()
print(datasets)

#             name  prefix                         description
# 0          basic     bas           Key governance indicators
# 1       standard     std  Comprehensive dataset (~2000 vars)
# 2           oecd    oecd               OECD member countries
# 3  environmental      ei            Environmental indicators
# 4  social_policy     soc                  Social policy data

list_versions()

Lists available publication years for a given dataset.

Signature
pyqog.list_versions(
    which_data: str = "basic"
) -> list[int]
Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| which_data | str | "basic" | Dataset name. Options: "basic", "standard", "oecd", "environmental", "social_policy". |

Returns

list[int] — A list of available publication years, sorted in descending order.

Example
import pyqog

versions = pyqog.list_versions("standard")
print(versions)
# [2026, 2025, 2024, 2023, 2022, 2021, 2020, 2019, 2018, ...]
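
A list of available years is useful for checking a requested year before downloading. The sketch below shows one possible policy: use the requested year if it exists, otherwise fall back to the newest earlier release. resolve_year is a hypothetical helper, not part of pyqog, and the version list is illustrative rather than real output.

```python
def resolve_year(available: list[int], requested: int) -> int:
    """Hypothetical helper: pick a publication year that actually exists."""
    if not available:
        raise ValueError("no published versions to choose from")
    if requested in available:
        return requested
    earlier = [y for y in available if y < requested]
    if not earlier:
        raise ValueError(
            f"no release at or before {requested}; oldest is {min(available)}"
        )
    return max(earlier)


# Illustrative list only, in descending order like the list_versions() output.
versions = [2026, 2025, 2024, 2022]
print(resolve_year(versions, 2025))  # 2025
print(resolve_year(versions, 2023))  # 2022
```

In real use, the list would come from pyqog.list_versions(...) and the result would be passed as the year argument of read_qog().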

get_codebook_url()

Returns the URL for a dataset's codebook PDF.

Signature
pyqog.get_codebook_url(
    which_data: str = "basic",
    year: int = 2026
) -> str
Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| which_data | str | "basic" | Dataset name. Options: "basic", "standard", "oecd", "environmental", "social_policy". |
| year | int | 2026 | Publication year of the dataset version. |

Returns

str — URL to the codebook PDF file.

Example
import pyqog

# Current version codebook
url = pyqog.get_codebook_url("standard", 2026)
print(url)
# https://www.qogdata.pol.gu.se/data/codebook_std_jan26.pdf

# Archived version codebook
url = pyqog.get_codebook_url("basic", 2020)
print(url)
# https://www.qogdata.pol.gu.se/dataarchive/codebook_bas_jan20.pdf
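
The two URLs above suggest a naming pattern: a dataset prefix, the suffix jan plus a two-digit year, and a different folder for archived releases. The sketch below reproduces that pattern. It is inferred from these two examples only and may not hold for every dataset or year. codebook_url_sketch and its current_year parameter are hypothetical, and the prefixes are taken from the list_datasets() example output.

```python
# Prefixes as shown in the list_datasets() example output.
PREFIXES = {
    "basic": "bas",
    "standard": "std",
    "oecd": "oecd",
    "environmental": "ei",
    "social_policy": "soc",
}
BASE = "https://www.qogdata.pol.gu.se"


def codebook_url_sketch(
    which_data: str = "basic", year: int = 2026, current_year: int = 2026
) -> str:
    """Hypothetical reconstruction of the URL pattern seen in the two examples."""
    prefix = PREFIXES[which_data]
    # Assumption: only the current release lives under /data/,
    # every older release under /dataarchive/.
    folder = "data" if year == current_year else "dataarchive"
    return f"{BASE}/{folder}/codebook_{prefix}_jan{year % 100:02d}.pdf"


print(codebook_url_sketch("standard", 2026))
print(codebook_url_sketch("basic", 2020))
```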

search_variables()

Searches for column names in a DataFrame that match a given pattern. Useful for finding variables in large QoG datasets.

Signature
pyqog.search_variables(
    df: pd.DataFrame,
    pattern: str
) -> list[str]
Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| df | pd.DataFrame | (required) | A pandas DataFrame (typically returned by read_qog()). |
| pattern | str | (required) | Search pattern (case-insensitive). Matches any part of the column name. |

Returns

list[str] — A list of column names that match the pattern.

Example
import pyqog

df = pyqog.read_qog(which_data="standard")

# Find corruption-related variables
corruption_vars = pyqog.search_variables(df, "corrupt")
print(corruption_vars)

# Find GDP-related variables
gdp_vars = pyqog.search_variables(df, "gdp")
print(gdp_vars)

# Find democracy-related variables
demo_vars = pyqog.search_variables(df, "demo")
print(demo_vars)
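
The matching rule (case-insensitive, anywhere in the name) can be illustrated without downloading anything. This is a stand-in, not pyqog's code: search_names is a hypothetical helper that treats the pattern as a literal substring, whereas the real function may interpret it differently (for example, as a regular expression). The column names are made up.

```python
def search_names(columns, pattern: str) -> list[str]:
    """Hypothetical stand-in: case-insensitive substring match on column names."""
    needle = pattern.lower()
    return [str(name) for name in columns if needle in str(name).lower()]


# Made-up column names for illustration; with a real DataFrame, pass df.columns.
columns = ["cname", "year", "Corruption_Index", "gdp_per_capita", "GDP_growth"]

print(search_names(columns, "gdp"))      # ['gdp_per_capita', 'GDP_growth']
print(search_names(columns, "CORRUPT"))  # ['Corruption_Index']
```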

describe_dataset()

Returns summary information about a QoG dataset: the number of rows, variables, countries, and years, plus the codebook URL.

Signature
pyqog.describe_dataset(
    which_data: str = "basic",
    year: int = 2026
) -> dict
Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| which_data | str | "basic" | Dataset name. Options: "basic", "standard", "oecd", "environmental", "social_policy". |
| year | int | 2026 | Publication year of the dataset version. |

Returns

dict — A dictionary with summary information:

| Key | Type | Description |
|---|---|---|
| "dataset" | str | Dataset name |
| "version" | int | Publication year |
| "n_rows" | int | Number of rows |
| "n_vars" | int | Number of variables (columns) |
| "n_countries" | int | Number of unique countries |
| "n_years" | int or None | Number of unique years (time-series only) |
| "codebook_url" | str | URL to the codebook PDF |

Example
import pyqog

info = pyqog.describe_dataset("standard", 2026)
print(info)
# {
#     "dataset": "standard",
#     "version": 2026,
#     "n_rows": 15000,
#     "n_vars": 2100,
#     "n_countries": 195,
#     "n_years": 75,
#     "codebook_url": "https://www.qogdata.pol.gu.se/data/codebook_std_jan26.pdf"
# }
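
The count fields in this dictionary are simple aggregates over the table. The sketch below shows how such counts could be derived from row records (for example, the output of df.to_dict("records")). It is not how pyqog computes them: summarize is a hypothetical helper, the column names cname and year are assumptions about the dataset layout, and the rows are toy data.

```python
def summarize(rows: list[dict], country_col: str = "cname", year_col: str = "year") -> dict:
    """Hypothetical helper: compute the count fields from row records."""
    has_year = bool(rows) and year_col in rows[0]
    return {
        "n_rows": len(rows),
        "n_vars": len(rows[0]) if rows else 0,
        "n_countries": len({r[country_col] for r in rows}),
        # Cross-sectional data has no year column, so report None.
        "n_years": len({r[year_col] for r in rows}) if has_year else None,
    }


# Toy rows, not real QoG data.
rows = [
    {"cname": "A", "year": 2000, "x": 1.0},
    {"cname": "A", "year": 2001, "x": 2.0},
    {"cname": "B", "year": 2000, "x": 3.0},
]
print(summarize(rows))
# {'n_rows': 3, 'n_vars': 3, 'n_countries': 2, 'n_years': 2}
```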