Complete documentation for all pyqog functions.
read_qog() is the main function of pyqog. It downloads a QoG dataset and returns it as a
pandas DataFrame, using a local cache by default.
pyqog.read_qog(
    which_data: str = "basic",
    data_type: str = "time-series",
    year: int = 2026,
    data_dir: str | None = None,
    cache: bool = True,
    update_cache: bool = False
) -> pd.DataFrame
| Parameter | Type | Default | Description |
|---|---|---|---|
| which_data | str | "basic" | Dataset to download. Options: "basic", "standard", "oecd", "environmental", "social_policy". |
| data_type | str | "time-series" | Data format. Options: "time-series" (panel data with country-year rows) or "cross-sectional" (one row per country). |
| year | int | 2026 | Publication year of the dataset version, not the year the data cover. For example, 2026 downloads the January 2026 release. Older years are fetched from the QoG data archive. |
| data_dir | str \| None | None | Directory for cached files. If None, the default location ~/.pyqog/cache/ is used; any other value overrides it. |
| cache | bool | True | Whether to use local caching. If True, checks for a cached file before downloading. If False, always downloads from the server (but does not save locally). |
| update_cache | bool | False | Force re-download even if a cached file exists. The new file replaces the old cached version. |
Returns pd.DataFrame: a pandas DataFrame containing the requested QoG dataset.
| Exception | Condition |
|---|---|
| ValueError | Invalid which_data, data_type, or year value. |
| requests.ConnectionError | No internet connection and no cached file available. |
| requests.HTTPError | Server returned an error (e.g., 404 for unavailable dataset). |
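Because a ValueError is raised for unsupported option values, it can be convenient to check arguments before making any network request. The following is a minimal sketch of such a pre-check, assuming only the option lists documented in the parameter table. The helper name check_arguments and the two constants are hypothetical and not part of pyqog, and year is left unchecked because the valid range is not stated here.

```python
# Option lists copied from the parameter table; the names are illustrative.
VALID_DATASETS = ("basic", "standard", "oecd", "environmental", "social_policy")
VALID_DATA_TYPES = ("time-series", "cross-sectional")


def check_arguments(which_data: str, data_type: str) -> None:
    """Raise ValueError if an option is outside the documented lists."""
    if which_data not in VALID_DATASETS:
        raise ValueError(
            f"which_data must be one of {VALID_DATASETS}, got {which_data!r}"
        )
    if data_type not in VALID_DATA_TYPES:
        raise ValueError(
            f"data_type must be one of {VALID_DATA_TYPES}, got {data_type!r}"
        )
```

Calling check_arguments("basic", "time-series") returns silently, while a misspelled dataset name fails immediately with a message listing the accepted values.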
import pyqog
# Basic time-series (default)
df = pyqog.read_qog()
# Standard cross-sectional
df = pyqog.read_qog(which_data="standard", data_type="cross-sectional")
# OECD dataset from 2022
df = pyqog.read_qog(which_data="oecd", year=2022)
# Force re-download
df = pyqog.read_qog(update_cache=True)
# Custom cache directory
df = pyqog.read_qog(data_dir="/tmp/qog_data")
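The interaction between data_dir, cache, and update_cache can be easier to follow as code. The sketch below mirrors the documented behaviour under stated assumptions: load_with_cache, filename, and the fetch callable are hypothetical stand-ins for the real download step, and the handling of cache=False combined with update_cache=True is a guess, since that combination is not described. It is not pyqog's actual implementation.

```python
from pathlib import Path
from typing import Callable, Optional


def load_with_cache(
    filename: str,
    fetch: Callable[[], bytes],
    data_dir: Optional[str] = None,
    cache: bool = True,
    update_cache: bool = False,
) -> bytes:
    """Return file contents, following the documented cache flags.

    `fetch` is a hypothetical callable that performs the download.
    """
    if not cache:
        # cache=False: always download and never write to disk.
        return fetch()

    # data_dir=None falls back to the documented default location.
    if data_dir is not None:
        cache_dir = Path(data_dir)
    else:
        cache_dir = Path.home() / ".pyqog" / "cache"
    path = cache_dir / filename

    if path.exists() and not update_cache:
        # Cache hit: no download needed.
        return path.read_bytes()

    # Cache miss or update_cache=True: download and replace the cached copy.
    content = fetch()
    cache_dir.mkdir(parents=True, exist_ok=True)
    path.write_bytes(content)
    return content
```

In this sketch, a second call with the same data_dir reads from disk without calling fetch again, and update_cache=True overwrites the stored file.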
list_datasets() lists all available QoG datasets with their descriptions.
pyqog.list_datasets() -> pd.DataFrame
Parameters: none.
Returns pd.DataFrame: a DataFrame with columns for dataset name, prefix,
and description.
import pyqog
datasets = pyqog.list_datasets()
print(datasets)
#             name  prefix                         description
# 0          basic     bas           Key governance indicators
# 1       standard     std  Comprehensive dataset (~2000 vars)
# 2           oecd    oecd               OECD member countries
# 3  environmental      ei            Environmental indicators
# 4  social_policy     soc                  Social policy data
list_versions() lists the available publication years for a given dataset.
pyqog.list_versions(
    which_data: str = "basic"
) -> list[int]
| Parameter | Type | Default | Description |
|---|---|---|---|
| which_data | str | "basic" | Dataset name. Options: "basic", "standard", "oecd", "environmental", "social_policy". |
Returns list[int]: a list of available publication years, sorted in descending order (newest first).
import pyqog
versions = pyqog.list_versions("standard")
print(versions)
# [2026, 2025, 2024, 2023, 2022, 2021, 2020, 2019, 2018, ...]
get_codebook_url() returns the URL of a dataset's codebook PDF.
pyqog.get_codebook_url(
    which_data: str = "basic",
    year: int = 2026
) -> str
| Parameter | Type | Default | Description |
|---|---|---|---|
| which_data | str | "basic" | Dataset name. Options: "basic", "standard", "oecd", "environmental", "social_policy". |
| year | int | 2026 | Publication year of the dataset version. |
Returns str: the URL of the codebook PDF file.
import pyqog
# Current version codebook
url = pyqog.get_codebook_url("standard", 2026)
print(url)
# https://www.qogdata.pol.gu.se/data/codebook_std_jan26.pdf
# Archived version codebook
url = pyqog.get_codebook_url("basic", 2020)
print(url)
# https://www.qogdata.pol.gu.se/dataarchive/codebook_bas_jan20.pdf
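The two example URLs suggest a naming pattern: the dataset prefix from list_datasets(), a jan plus two-digit year suffix, and a data or dataarchive folder depending on whether the requested year is the current release. The sketch below reproduces that pattern purely as an inference from these two examples. The function sketch_codebook_url, the PREFIXES mapping, and the latest_year parameter are hypothetical, and other releases may not follow the same scheme.

```python
# Prefixes as shown in the list_datasets() example output.
PREFIXES = {
    "basic": "bas",
    "standard": "std",
    "oecd": "oecd",
    "environmental": "ei",
    "social_policy": "soc",
}
BASE_URL = "https://www.qogdata.pol.gu.se"


def sketch_codebook_url(which_data: str, year: int, latest_year: int = 2026) -> str:
    """Build a codebook URL following the pattern seen in the two examples.

    Assumption: only the latest release lives under /data/; every other
    year is served from /dataarchive/.
    """
    prefix = PREFIXES[which_data]
    folder = "data" if year == latest_year else "dataarchive"
    # Two-digit year suffix, e.g. 2026 -> "26".
    return f"{BASE_URL}/{folder}/codebook_{prefix}_jan{year % 100:02d}.pdf"
```

For real use, get_codebook_url() remains the authoritative source; the sketch only makes the visible pattern explicit.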
search_variables() searches a DataFrame for column names that match a given pattern. This is useful for finding variables in large QoG datasets.
pyqog.search_variables(
    df: pd.DataFrame,
    pattern: str
) -> list[str]
| Parameter | Type | Default | Description |
|---|---|---|---|
| df | pd.DataFrame | — | A pandas DataFrame (typically returned by read_qog()). |
| pattern | str | — | Search pattern (case-insensitive). Matches any part of the column name. |
Returns list[str]: a list of column names that match the pattern.
import pyqog
df = pyqog.read_qog(which_data="standard")
# Find corruption-related variables
corruption_vars = pyqog.search_variables(df, "corrupt")
print(corruption_vars)
# Find GDP-related variables
gdp_vars = pyqog.search_variables(df, "gdp")
print(gdp_vars)
# Find democracy-related variables
demo_vars = pyqog.search_variables(df, "demo")
print(demo_vars)
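The matching rule described for pattern (case-insensitive, matching any part of the column name) can be illustrated without a DataFrame. The sketch below treats the pattern as a plain substring, which is an assumption: whether the real function also accepts regular expressions is not stated here. The helper match_columns and the column names in the usage example are made up for illustration.

```python
def match_columns(columns: list[str], pattern: str) -> list[str]:
    """Return the names containing `pattern`, ignoring case.

    Plain substring matching; original order of the names is preserved.
    """
    needle = pattern.lower()
    return [name for name in columns if needle in name.lower()]


# Made-up column names, used only to demonstrate the matching rule.
example_columns = ["cname", "year", "xx_corruption", "XX_GDP_pc", "yy_gdpgrowth"]
gdp_like = match_columns(example_columns, "gdp")
```

With a real dataset, the same idea applies to list(df.columns).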
describe_dataset() returns summary information about a QoG dataset: its row count, its number of variables, countries, and years, and its codebook URL.
pyqog.describe_dataset(
    which_data: str = "basic",
    year: int = 2026
) -> dict
| Parameter | Type | Default | Description |
|---|---|---|---|
| which_data | str | "basic" | Dataset name. Options: "basic", "standard", "oecd", "environmental", "social_policy". |
| year | int | 2026 | Publication year of the dataset version. |
Returns dict: a dictionary with the following keys:
| Key | Type | Description |
|---|---|---|
| "dataset" | str | Dataset name |
| "version" | int | Publication year |
| "n_rows" | int | Number of rows |
| "n_vars" | int | Number of variables (columns) |
| "n_countries" | int | Number of unique countries |
| "n_years" | int \| None | Number of unique years (time-series only) |
| "codebook_url" | str | URL to the codebook PDF |
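For code that consumes this dictionary, the key table can be written down as a typed structure so that a type checker catches misspelled keys. The following is a sketch assuming the keys and types listed above. DatasetSummary and years_label are hypothetical names, and pyqog itself is only documented as returning a plain dict.

```python
from typing import Optional, TypedDict


class DatasetSummary(TypedDict):
    """Shape of the summary dictionary, as listed in the key table."""

    dataset: str
    version: int
    n_rows: int
    n_vars: int
    n_countries: int
    n_years: Optional[int]  # None when there is no year dimension
    codebook_url: str


def years_label(info: DatasetSummary) -> str:
    """Format the year count, guarding against the documented None case."""
    if info["n_years"] is None:
        return "no year dimension"
    return f"{info['n_years']} years"
```

Since n_years may be None, any arithmetic on it should be guarded in the same way as in years_label.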
import pyqog
info = pyqog.describe_dataset("standard", 2026)
print(info)
# {
#     "dataset": "standard",
#     "version": 2026,
#     "n_rows": 15000,
#     "n_vars": 2100,
#     "n_countries": 195,
#     "n_years": 75,
#     "codebook_url": "https://www.qogdata.pol.gu.se/data/codebook_std_jan26.pdf"
# }