How to Consume OpenEV Data¶
This guide explains how to download and use the OpenEV Data dataset in various formats.
Available Formats¶
The dataset is available in multiple formats to suit different use cases:
| Format | Size | Best For | File Extension |
|---|---|---|---|
| JSON | ~46 KB | APIs, Web Apps, JavaScript | .json |
| CSV | ~4 KB | Spreadsheets, Analytics, Excel | .csv |
| PostgreSQL | ~37 KB | Production Databases, Complex Queries | .sql |
| SQLite | ~128 KB | Mobile Apps, Desktop Apps, Embedded | .db |
| XML | ~14 KB | Enterprise Systems, Legacy Integration | .xml |
Downloading the Dataset¶
Latest Release¶
Download the latest version directly from GitHub:
LATEST_TAG=$(curl -s https://api.github.com/repos/open-ev-data/open-ev-data-dataset/releases/latest | jq -r .tag_name)
curl -L -o open-ev-data.json "https://github.com/open-ev-data/open-ev-data-dataset/releases/download/$LATEST_TAG/open-ev-data-$LATEST_TAG.json"
curl -L -o open-ev-data.csv "https://github.com/open-ev-data/open-ev-data-dataset/releases/download/$LATEST_TAG/open-ev-data-$LATEST_TAG.csv"
curl -L -o open-ev-data.sql "https://github.com/open-ev-data/open-ev-data-dataset/releases/download/$LATEST_TAG/open-ev-data-$LATEST_TAG.sql"
curl -L -o open-ev-data.db "https://github.com/open-ev-data/open-ev-data-dataset/releases/download/$LATEST_TAG/open-ev-data-$LATEST_TAG.db"
curl -L -o open-ev-data.xml "https://github.com/open-ev-data/open-ev-data-dataset/releases/download/$LATEST_TAG/open-ev-data-$LATEST_TAG.xml"
Specific Version¶
Replace v1.0.0 with your desired version:
VERSION="v1.0.0"
curl -L -o open-ev-data.json "https://github.com/open-ev-data/open-ev-data-dataset/releases/download/$VERSION/open-ev-data-$VERSION.json"
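The same URL pattern can be built programmatically. The following is a minimal Python sketch, assuming release assets follow the `open-ev-data-<tag>.<extension>` naming shown in the curl commands above; the helper names `asset_url` and `download` are illustrative and not part of any OpenEV Data tooling.

```python
from urllib.request import urlretrieve

# Base path taken from the curl commands in this guide.
BASE_URL = "https://github.com/open-ev-data/open-ev-data-dataset/releases/download"


def asset_url(version: str, extension: str) -> str:
    """Build the download URL for one release asset (hypothetical helper)."""
    return f"{BASE_URL}/{version}/open-ev-data-{version}.{extension}"


def download(version: str, extension: str) -> str:
    """Save the asset as open-ev-data.<extension> and return the local path."""
    target = f"open-ev-data.{extension}"
    urlretrieve(asset_url(version, extension), target)
    return target
```

Calling `download("v1.0.0", "json")` would fetch the same file as the curl command for a specific version.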
Using JSON Format¶
Structure¶
{
"version": "1.0.0",
"generated_at": "2025-01-01T00:00:00Z",
"vehicles": [
{
"brand": "BYD",
"model": "Dolphin",
"variant": "Standard Range",
"year": 2024,
"battery": {
"capacity_kwh": 44.9
},
"range": {
"wltp": {
"km": 340,
"miles": 211
}
},
"sources": [
{
"country": "Brazil",
"name": "Official Website",
"url": "https://..."
}
]
}
]
}
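To illustrate how the nested fields are reached, here is a small Python sketch that parses a copy of the structure above. It assumes, without confirmation from the schema, that optional blocks such as `range` may be missing for some vehicles, so the hypothetical helper `wltp_range_km` returns `None` instead of raising.

```python
import json

# Copy of the structure shown above, used as inline sample input.
SAMPLE = """
{
  "version": "1.0.0",
  "generated_at": "2025-01-01T00:00:00Z",
  "vehicles": [
    {
      "brand": "BYD",
      "model": "Dolphin",
      "variant": "Standard Range",
      "year": 2024,
      "battery": {"capacity_kwh": 44.9},
      "range": {"wltp": {"km": 340, "miles": 211}},
      "sources": [
        {"country": "Brazil", "name": "Official Website", "url": "https://..."}
      ]
    }
  ]
}
"""


def wltp_range_km(vehicle: dict):
    """Return the WLTP range in km, or None if any nesting level is absent."""
    return vehicle.get("range", {}).get("wltp", {}).get("km")


data = json.loads(SAMPLE)
for vehicle in data["vehicles"]:
    capacity = vehicle["battery"]["capacity_kwh"]
    print(f"{vehicle['brand']} {vehicle['model']}: {capacity} kWh, {wltp_range_km(vehicle)} km WLTP")
```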
Examples¶
JavaScript/Node.js¶
const fs = require('fs');
const data = JSON.parse(fs.readFileSync('open-ev-data.json', 'utf8'));
console.log(`Dataset version: ${data.version}`);
console.log(`Total vehicles: ${data.vehicles.length}`);
const bydVehicles = data.vehicles.filter(v => v.brand === 'BYD');
console.log(`BYD vehicles: ${bydVehicles.length}`);
Python¶
import requests
url = "https://github.com/open-ev-data/open-ev-data-dataset/releases/latest/download/open-ev-data.json"
response = requests.get(url)
response.raise_for_status()  # fail early on HTTP errors instead of parsing an error page
data = response.json()
print(f"Dataset version: {data['version']}")
print(f"Total vehicles: {len(data['vehicles'])}")
byd_vehicles = [v for v in data['vehicles'] if v['brand'] == 'BYD']
print(f"BYD vehicles: {len(byd_vehicles)}")
for vehicle in byd_vehicles:
    print(f"- {vehicle['brand']} {vehicle['model']} ({vehicle['year']})")
curl + jq¶
curl -sL "https://github.com/open-ev-data/open-ev-data-dataset/releases/latest/download/open-ev-data.json" | \
jq '.vehicles[] | select(.brand == "BYD") | "\(.brand) \(.model) \(.year)"'
Using CSV Format¶
Structure¶
brand,model,variant,year,battery_capacity_kwh,range_km,range_miles,charging_speed_kw,acceleration_0_100_kmh,top_speed_kmh,drive_type,seats,country,source_name,source_url
BYD,Dolphin,Standard Range,2024,44.9,340,211,60,10.5,160,FWD,5,Brazil,Official Website,https://...
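CSV values arrive as strings, so numeric columns need converting before use. The sketch below does this with Python's standard `csv` module on the sample row above. The column-to-type mapping is inferred from that single row and is an assumption, as is treating empty cells as missing values.

```python
import csv
import io

# Header and sample row copied from the structure above.
SAMPLE = (
    "brand,model,variant,year,battery_capacity_kwh,range_km,range_miles,"
    "charging_speed_kw,acceleration_0_100_kmh,top_speed_kmh,drive_type,seats,"
    "country,source_name,source_url\n"
    "BYD,Dolphin,Standard Range,2024,44.9,340,211,60,10.5,160,FWD,5,"
    "Brazil,Official Website,https://...\n"
)

# Assumed types, inferred from the sample row only.
NUMERIC_COLUMNS = {
    "year": int,
    "battery_capacity_kwh": float,
    "range_km": int,
    "range_miles": int,
    "charging_speed_kw": float,
    "acceleration_0_100_kmh": float,
    "top_speed_kmh": int,
    "seats": int,
}


def parse_rows(text: str) -> list:
    """Parse CSV text into dicts, converting numeric columns and mapping empty cells to None."""
    rows = []
    for raw in csv.DictReader(io.StringIO(text)):
        row = dict(raw)
        for column, cast in NUMERIC_COLUMNS.items():
            value = row.get(column)
            row[column] = cast(value) if value not in (None, "") else None
        rows.append(row)
    return rows


rows = parse_rows(SAMPLE)
print(rows[0]["brand"], rows[0]["year"], rows[0]["range_km"])
```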
Examples¶
Excel/Google Sheets¶
- Download the CSV file
- Open in Excel or Google Sheets
- Use built-in filters and pivot tables
Python/Pandas¶
import pandas as pd
url = "https://github.com/open-ev-data/open-ev-data-dataset/releases/latest/download/open-ev-data.csv"
df = pd.read_csv(url)
print(df.describe())
avg_range = df.groupby('brand')['range_km'].mean()
print(avg_range)
df_2024 = df[df['year'] == 2024]
print(f"2024 vehicles: {len(df_2024)}")
R¶
library(tidyverse)
url <- "https://github.com/open-ev-data/open-ev-data-dataset/releases/latest/download/open-ev-data.csv"
data <- read_csv(url)
summary(data)
data %>%
group_by(brand) %>%
summarise(avg_range = mean(range_km, na.rm = TRUE)) %>%
arrange(desc(avg_range))
Using PostgreSQL Format¶
Import into Database¶
Docker PostgreSQL¶
Use the pre-populated Docker image:
docker run -d \
--name openev-postgres \
-p 5432:5432 \
-e POSTGRES_PASSWORD=openevdata \
ghcr.io/open-ev-data/open-ev-data-postgres:latest
Connect:
Query Examples¶
SELECT brand, model, year, battery_capacity_kwh
FROM vehicles
WHERE brand = 'BYD'
ORDER BY year DESC;
SELECT brand, COUNT(*) as total_models
FROM vehicles
GROUP BY brand
ORDER BY total_models DESC;
SELECT v.brand, v.model, v.year, s.country
FROM vehicles v
JOIN sources s ON v.id = s.vehicle_id
WHERE s.country = 'Brazil';
SELECT brand, AVG(range_wltp_km) as avg_range
FROM vehicles
WHERE range_wltp_km IS NOT NULL
GROUP BY brand
HAVING COUNT(*) > 1
ORDER BY avg_range DESC;
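The last query can be tried without a PostgreSQL server. The Python sketch below runs it against an in-memory SQLite table. The table definition is an assumption that includes only the columns the query needs, and the rows are invented placeholders, not dataset values. It shows that rows with a NULL range are filtered out first and that brands left with a single row are then dropped by `HAVING COUNT(*) > 1`.

```python
import sqlite3

QUERY = """
SELECT brand, AVG(range_wltp_km) AS avg_range
FROM vehicles
WHERE range_wltp_km IS NOT NULL
GROUP BY brand
HAVING COUNT(*) > 1
ORDER BY avg_range DESC
"""


def average_range_by_brand(conn: sqlite3.Connection) -> list:
    """Run the aggregate query and return (brand, avg_range) tuples."""
    return conn.execute(QUERY).fetchall()


conn = sqlite3.connect(":memory:")
# Assumed minimal schema; the real dump may define more columns.
conn.execute(
    "CREATE TABLE vehicles ("
    "id INTEGER PRIMARY KEY, brand TEXT, model TEXT, range_wltp_km INTEGER)"
)
# Placeholder rows, invented for illustration only.
conn.executemany(
    "INSERT INTO vehicles (brand, model, range_wltp_km) VALUES (?, ?, ?)",
    [
        ("BrandA", "One", 300),
        ("BrandA", "Two", 400),
        ("BrandB", "Solo", 500),      # only one row, excluded by HAVING
        ("BrandC", "NoRange", None),  # NULL range, excluded by WHERE
        ("BrandC", "X", 250),
        ("BrandC", "Y", 350),
    ],
)
result = average_range_by_brand(conn)
for brand, avg_range in result:
    print(f"{brand}: {avg_range:.0f} km")
conn.close()
```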
Using SQLite Format¶
Query from Command Line¶
Python¶
import sqlite3
conn = sqlite3.connect('open-ev-data.db')
cursor = conn.cursor()
cursor.execute("SELECT brand, COUNT(*) FROM vehicles GROUP BY brand")
for row in cursor.fetchall():
    print(f"{row[0]}: {row[1]} models")
conn.close()
Node.js¶
const sqlite3 = require('sqlite3').verbose();
const db = new sqlite3.Database('open-ev-data.db');
db.all("SELECT brand, model, year FROM vehicles WHERE brand = ?", ['BYD'], (err, rows) => {
if (err) throw err;
rows.forEach(row => {
console.log(`${row.brand} ${row.model} (${row.year})`);
});
});
db.close();
Using XML Format¶
Structure¶
<?xml version="1.0" encoding="UTF-8"?>
<dataset version="1.0.0" generated_at="2025-01-01T00:00:00Z">
<vehicles>
<vehicle>
<brand>BYD</brand>
<model>Dolphin</model>
<year>2024</year>
<battery capacity_kwh="44.9"/>
<range>
<wltp km="340" miles="211"/>
</range>
</vehicle>
</vehicles>
</dataset>
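In this structure the battery capacity and range figures are stored as attributes rather than element text. The following Python sketch reads both kinds of value from a copy of the sample above (the XML declaration is omitted from the inline string). The helper name `vehicle_summary` is illustrative, and handling a missing `battery` or `wltp` element is an assumption about the data.

```python
import xml.etree.ElementTree as ET

# Copy of the sample document above, without the XML declaration.
SAMPLE = """
<dataset version="1.0.0" generated_at="2025-01-01T00:00:00Z">
  <vehicles>
    <vehicle>
      <brand>BYD</brand>
      <model>Dolphin</model>
      <year>2024</year>
      <battery capacity_kwh="44.9"/>
      <range>
        <wltp km="340" miles="211"/>
      </range>
    </vehicle>
  </vehicles>
</dataset>
"""


def vehicle_summary(vehicle: ET.Element) -> dict:
    """Collect element text and attribute values for one <vehicle> element."""
    battery = vehicle.find("battery")
    wltp = vehicle.find("range/wltp")
    return {
        "brand": vehicle.findtext("brand"),
        "model": vehicle.findtext("model"),
        "year": int(vehicle.findtext("year")),
        # Attributes are read with .get(); elements may be absent (assumption).
        "capacity_kwh": float(battery.get("capacity_kwh")) if battery is not None else None,
        "wltp_km": int(wltp.get("km")) if wltp is not None else None,
    }


root = ET.fromstring(SAMPLE)
summaries = [vehicle_summary(v) for v in root.iter("vehicle")]
print(root.get("version"), summaries)
```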
Examples¶
Python¶
import xml.etree.ElementTree as ET
import requests
url = "https://github.com/open-ev-data/open-ev-data-dataset/releases/latest/download/open-ev-data.xml"
response = requests.get(url)
root = ET.fromstring(response.content)
for vehicle in root.findall('.//vehicle'):
    brand = vehicle.find('brand').text
    model = vehicle.find('model').text
    year = vehicle.find('year').text
    print(f"{brand} {model} ({year})")
Java¶
import javax.xml.parsers.*;
import org.w3c.dom.*;
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();
Document doc = builder.parse("open-ev-data.xml");
NodeList vehicles = doc.getElementsByTagName("vehicle");
for (int i = 0; i < vehicles.getLength(); i++) {
Element vehicle = (Element) vehicles.item(i);
String brand = vehicle.getElementsByTagName("brand").item(0).getTextContent();
String model = vehicle.getElementsByTagName("model").item(0).getTextContent();
System.out.println(brand + " " + model);
}
Full Stack with Docker Compose¶
See the docker-compose.yml in the repository root to run the complete stack with API and database.
This starts:
- PostgreSQL database with pre-populated data (port 5432)
- OpenEV Data API (port 8080)
Access the API:
See API Documentation for more details.
Integration Examples¶
REST API Endpoint¶
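const express = require('express');
const app = express();
const DATA_URL = 'https://github.com/open-ev-data/open-ev-data-dataset/releases/latest/download/open-ev-data.json';
app.get('/api/vehicles', async (req, res) => {
  // Fetches the dataset on every request; global fetch requires Node.js 18+
  const response = await fetch(DATA_URL);
  const data = await response.json();
  const { brand, year } = req.query;
  let vehicles = data.vehicles;
  // Query parameters are strings, so convert year before comparing
  if (brand) vehicles = vehicles.filter(v => v.brand === brand);
  if (year) vehicles = vehicles.filter(v => v.year === parseInt(year, 10));
  res.json(vehicles);
});
app.listen(3000); // arbitrary port chosen for this example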
Caching Strategy¶
const NodeCache = require('node-cache');
const cache = new NodeCache({ stdTTL: 3600 });
async function getVehicles() {
const cached = cache.get('vehicles');
if (cached) return cached;
const response = await fetch('https://github.com/open-ev-data/open-ev-data-dataset/releases/latest/download/open-ev-data.json');
const data = await response.json();
cache.set('vehicles', data.vehicles);
return data.vehicles;
}
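The same time-to-live idea can be sketched in Python without extra dependencies. This is a minimal, single-threaded illustration and not a port of `node-cache`. The class name `TTLCache` and its parameters are hypothetical, and the fetch function is injected so the network call can be replaced in tests.

```python
import time
from typing import Callable, Optional


class TTLCache:
    """Cache one fetched value and refresh it once ttl_seconds have elapsed."""

    def __init__(
        self,
        fetch: Callable[[], list],
        ttl_seconds: float = 3600,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._value: Optional[list] = None
        self._expires_at: Optional[float] = None  # None means nothing cached yet

    def get(self) -> list:
        now = self._clock()
        if self._expires_at is None or now >= self._expires_at:
            # Cache miss or expired entry: fetch again and reset the deadline.
            self._value = self._fetch()
            self._expires_at = now + self._ttl
        return self._value
```

A caller would pass a function that downloads the JSON file and returns its `vehicles` list, then call `get()` wherever the data is needed.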
Automated Updates¶
#!/bin/bash
LATEST_TAG=$(curl -s https://api.github.com/repos/open-ev-data/open-ev-data-dataset/releases/latest | jq -r .tag_name)
CURRENT_TAG=$(cat .current_version 2>/dev/null || echo "")
if [ "$LATEST_TAG" != "$CURRENT_TAG" ]; then
echo "New version available: $LATEST_TAG"
curl -L -o open-ev-data.json "https://github.com/open-ev-data/open-ev-data-dataset/releases/download/$LATEST_TAG/open-ev-data-$LATEST_TAG.json"
echo "$LATEST_TAG" > .current_version
systemctl restart my-app
fi
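The version comparison in the script above can also be expressed in Python. The sketch below covers only the decision logic and the bookkeeping file, with hypothetical helper names. It additionally rejects an empty tag and the literal string `null`, which `jq -r` prints when the API response has no `tag_name` field; that guard is an addition of this sketch and is not present in the shell script.

```python
from pathlib import Path


def read_current_version(version_file: Path) -> str:
    """Return the recorded tag, or an empty string if no version was recorded yet."""
    if not version_file.exists():
        return ""
    return version_file.read_text().strip()


def needs_update(latest_tag: str, version_file: Path) -> bool:
    """True when latest_tag is a usable tag that differs from the recorded one."""
    if not latest_tag or latest_tag == "null":
        return False  # the lookup failed, so keep the current data
    return latest_tag != read_current_version(version_file)


def record_version(latest_tag: str, version_file: Path) -> None:
    """Persist the tag after a successful download."""
    version_file.write_text(latest_tag + "\n")
```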
License¶
This dataset is released under CDLA-Permissive-2.0.